Text File | 1995-03-27 | 242.7 KB | 5,306 lines | [TEXT/ROSA]

Common Lisp the Language, 2nd Edition
-------------------------------------------------------------------------------

22. Input/Output

Common Lisp provides a rich set of facilities for performing input/output. All input/output operations are performed on streams of various kinds. This chapter is devoted to stream data transfer operations. Streams are discussed in chapter 21, and ways of manipulating files through streams are discussed in chapter 23.

While there is provision for reading and writing binary data, most of the I/O operations in Common Lisp read or write characters. There are simple primitives for reading and writing single characters or lines of data. The format function can perform complex formatting of output data, directed by a control string in a manner similar to a Fortran FORMAT statement or a PL/I PUT EDIT statement. The most useful I/O operations, however, read and write printed representations of arbitrary Lisp objects.

-------------------------------------------------------------------------------
* Printed Representation of Lisp Objects
    o What the Read Function Accepts
    o Parsing of Numbers and Symbols
    o Macro Characters
    o Standard Dispatching Macro Character Syntax
    o The Readtable
    o What the Print Function Produces
* Input Functions
    o Input from Character Streams
    o Input from Binary Streams
* Output Functions
    o Output to Character Streams
    o Output to Binary Streams
    o Formatted Output to Character Streams
* Querying the User
-------------------------------------------------------------------------------

22.1. Printed Representation of Lisp Objects

Lisp objects in general are not text strings but complex data structures. They have very different properties from text strings as a consequence of their internal representation.
However, to make it possible to get at and talk about Lisp objects, Lisp provides a representation of most objects in the form of printed text; this is called the printed representation, which is used for input/output purposes and in the examples throughout this book. Functions such as print take a Lisp object and send the characters of its printed representation to a stream. The collection of routines that does this is known as the (Lisp) printer. The read function takes characters from a stream, interprets them as a printed representation of a Lisp object, builds that object, and returns it; the collection of routines that does this is called the (Lisp) reader.

Ideally, one could print a Lisp object and then read the printed representation back in, and so obtain the same identical object. In practice this is difficult and for some purposes not even desirable. Instead, reading a printed representation produces an object that is (with obscure technical exceptions) equal to the originally printed object.

Most Lisp objects have more than one possible printed representation. For example, the integer twenty-seven can be written in any of these ways:

    27    27.    #o33    #x1B    #b11011    #.(* 3 3 3)    81/3

A list of two symbols A and B can be printed in many ways:

    (A B)    (a b)    (  a  b )    (\A |B|)
    (|\A|
      B
    )

The last example, which is spread over three lines, may be ugly, but it is legitimate. In general, wherever whitespace is permissible in a printed representation, any number of spaces and newlines may appear.

When print produces a printed representation, it must choose arbitrarily from among many possible printed representations. It attempts to choose one that is readable. There are a number of global variables that can be used to control the actions of print, and a number of different printing functions.

This section describes in detail what is the standard printed representation for any Lisp object and also describes how read operates.
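To make the point about multiple printed representations concrete, here is a minimal sketch (not part of the original text) that feeds each notation for twenty-seven listed above to read-from-string. The names *twenty-seven-notations*, reads-as-27-p, and same-list-p are hypothetical. with-standard-io-syntax is used on the assumption that the standard readtable, a *read-base* of 10, and a true *read-eval* (needed for #.) are wanted.

```lisp
;; Printed representations of twenty-seven, as strings to be read.
(defparameter *twenty-seven-notations*
  '("27" "27." "#o33" "#x1B" "#b11011" "#.(* 3 3 3)" "81/3"))

(defun reads-as-27-p (string)
  "Hypothetical helper: true if STRING reads as the integer 27."
  (with-standard-io-syntax
    (eql 27 (read-from-string string))))

(defun same-list-p (string-1 string-2)
  "Hypothetical helper: true if both strings read as EQUAL objects."
  (with-standard-io-syntax
    (equal (read-from-string string-1) (read-from-string string-2))))

;; All seven notations denote the same integer.
(every #'reads-as-27-p *twenty-seven-notations*)       ; => T

;; Extra whitespace and escape characters do not change the list read.
;; Inside a Lisp string literal, \\ stands for one backslash character.
(same-list-p "(A B)" "(  a  b )")                       ; => T
(same-list-p "(A B)" "(\\A |B|)")                       ; => T
(same-list-p "(A B)" (format nil "(|\\A|~%  B~%)"))     ; => T
```

The comparison uses equal rather than eq for the lists because each call to the reader builds fresh conses, even though the symbols inside them are identical.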
------------------------------------------------------------------------------- * What the Read Function Accepts * Parsing of Numbers and Symbols * Macro Characters * Standard Dispatching Macro Character Syntax * The Readtable * What the Print Function Produces ------------------------------------------------------------------------------- 22.1.1. What the Read Function Accepts The purpose of the Lisp reader is to accept characters, interpret them as the printed representation of a Lisp object, and construct and return such an object. The reader cannot accept everything that the printer produces; for example, the printed representations of compiled code objects cannot be read in. However, the reader has many features that are not used by the output of the printer at all, such as comments, alternative representations, and convenient abbreviations for frequently used but unwieldy constructs. The reader is also parameterized in such a way that it can be used as a lexical analyzer for a more general user-written parser. The reader is organized as a recursive-descent parser. Broadly speaking, the reader operates by reading a character from the input stream and treating it in one of three ways. Whitespace characters serve as separators but are otherwise ignored. Constituent and escape characters are accumulated to make a token, which is then interpreted as a number or symbol. Macro characters trigger the invocation of functions (possibly user-supplied) that can perform arbitrary parsing actions, including recursive invocation of the reader. More precisely, when the reader is invoked, it reads a single character from the input stream and dispatches according to the syntactic type of that character. Every character that can appear in the input stream must be of exactly one of the following kinds: illegal, whitespace, constituent, single escape, multiple escape, or macro. Macro characters are further divided into the types terminating and non-terminating (of tokens). 
(Note that macro characters have nothing whatever to do with macros in their operation. There is a superficial similarity in that macros allow the user to extend the syntax of Common Lisp at the level of forms, while macro characters allow the user to extend the syntax at the level of characters.) Constituents additionally have one or more attributes, the most important of which is alphabetic; these attributes are discussed further in section 22.1.2.

The parsing of Common Lisp expressions is discussed in terms of these syntactic character types because the types of individual characters are not fixed but may be altered by the user (see set-syntax-from-char and set-macro-character).

The characters of the standard character set initially have the syntactic types shown in table 22-1. Note that the brackets, braces, question mark, and exclamation point (that is, [, ], {, }, ?, and !) are normally defined to be constituents, but they are not used for any purpose in standard Common Lisp syntax and do not occur in the names of built-in Common Lisp functions or variables. These characters are explicitly reserved to the user. The primary intent is that they be used as macro characters; but a user might choose, for example, to make ! be a single escape character (as it is in Portable Standard Lisp).

----------------------------------------------------------------
Table 22-1: Standard Character Syntax Types

<tab>      whitespace             <page>     whitespace             <newline>  whitespace
<space>    whitespace             @          constituent            `          terminating macro
!          constituent *          A          constituent            a          constituent
"          terminating macro      B          constituent            b          constituent
#          non-terminating macro  C          constituent            c          constituent
$          constituent            D          constituent            d          constituent
%          constituent            E          constituent            e          constituent
&          constituent            F          constituent            f          constituent
'          terminating macro      G          constituent            g          constituent
(          terminating macro      H          constituent            h          constituent
)          terminating macro      I          constituent            i          constituent
*          constituent            J          constituent            j          constituent
+          constituent            K          constituent            k          constituent
,          terminating macro      L          constituent            l          constituent
-          constituent            M          constituent            m          constituent
.          constituent            N          constituent            n          constituent
/          constituent            O          constituent            o          constituent
0          constituent            P          constituent            p          constituent
1          constituent            Q          constituent            q          constituent
2          constituent            R          constituent            r          constituent
3          constituent            S          constituent            s          constituent
4          constituent            T          constituent            t          constituent
5          constituent            U          constituent            u          constituent
6          constituent            V          constituent            v          constituent
7          constituent            W          constituent            w          constituent
8          constituent            X          constituent            x          constituent
9          constituent            Y          constituent            y          constituent
:          constituent            Z          constituent            z          constituent
;          terminating macro      [          constituent *          {          constituent *
<          constituent            \          single escape          |          multiple escape
=          constituent            ]          constituent *          }          constituent *
>          constituent            ^          constituent            ~          constituent
?          constituent *          _          constituent            <rubout>   constituent
<bkspace>  constituent            <return>   whitespace             <linefeed> whitespace

The characters marked with an asterisk are initially constituents but are reserved to the user for use as macro characters or for any other desired purpose.
----------------------------------------------------------------

The algorithm performed by the Common Lisp reader is roughly as follows:

1. If at end of file, perform end-of-file processing (as specified by the caller of the read function). Otherwise, read one character from the input stream, call it x, and dispatch according to the syntactic type of x to one of steps 2 to 7.

2. If x is an illegal character, signal an error.

3. If x is a whitespace character, then discard it and go back to step 1.

4. If x is a macro character (at this point the distinction between terminating and non-terminating macro characters does not matter), then execute the function associated with that character. The function may return zero values or one value (see values). The macro-character function may of course read characters from the input stream; if it does, it will see those characters following the macro character. The function may even invoke the reader recursively. This is how the macro character ( constructs a list: by invoking the reader recursively to read the elements of the list. If one value is returned, then return that value as the result of the read operation; the algorithm is done. If zero values are returned, then go back to step 1.

5. If x is a single escape character (normally \), then read the next character and call it y (but if at end of file, signal an error instead). Ignore the usual syntax of y and pretend it is a constituent whose only attribute is alphabetic. [old_change_begin] (If y is a lowercase character, leave it alone; do not replace it with the corresponding uppercase character.) [old_change_end] [change_begin] For the purposes of readtable-case, y is not replaceable. [change_end] Use y to begin a token, and go to step 8.

6. If x is a multiple escape character (normally |), then begin a token (initially containing no characters) and go to step 9.

7. If x is a constituent character, then it begins an extended token. After the entire token is read in, it will be interpreted either as representing a Lisp object such as a symbol or number (in which case that object is returned as the result of the read operation), or as being of illegal syntax (in which case an error is signaled). [old_change_begin] If x is a lowercase character, replace it with the corresponding uppercase character. [old_change_end] [change_begin] X3J13 voted in June 1989 (READ-CASE-SENSITIVITY) to introduce readtable-case. Consequently, the preceding sentence should be ignored. The case of x should not be altered; instead, x should be regarded as replaceable. [change_end] Use x to begin a token, and go on to step 8.

8. (At this point a token is being accumulated, and an even number of multiple escape characters have been encountered.) If at end of file, go to step 10. Otherwise, read a character (call it y), and perform one of the following actions according to its syntactic type:

    o If y is a constituent or non-terminating macro, then do the following. [old_change_begin] If y is a lowercase character, replace it with the corresponding uppercase character. [old_change_end] [change_begin] X3J13 voted in June 1989 (READ-CASE-SENSITIVITY) to introduce readtable-case. Consequently, the preceding sentence should be ignored. The case of y should not be altered; instead, y should be regarded as replaceable. Append y to the token being built, and repeat step 8. [change_end]

    o If y is a single escape character, then read the next character and call it z (but if at end of file, signal an error instead). Ignore the usual syntax of z and pretend it is a constituent whose only attribute is alphabetic. [old_change_begin] (If z is a lowercase character, leave it alone; do not replace it with the corresponding uppercase character.) [old_change_end] [change_begin] For the purposes of readtable-case, z is not replaceable. [change_end] Append z to the token being built, and repeat step 8.

    o If y is a multiple escape character, then go to step 9.

    o If y is an illegal character, signal an error.

    o If y is a terminating macro character, it terminates the token. First ``unread'' the character y (see unread-char), then go to step 10.

    o If y is a whitespace character, it terminates the token. First ``unread'' y if appropriate (see read-preserving-whitespace), then go to step 10.

9. (At this point a token is being accumulated, and an odd number of multiple escape characters have been encountered.) If at end of file, signal an error. Otherwise, read a character (call it y), and perform one of the following actions according to its syntactic type:

    o If y is a constituent, macro, or whitespace character, then ignore the usual syntax of that character and pretend it is a constituent whose only attribute is alphabetic. [old_change_begin] (If y is a lowercase character, leave it alone; do not replace it with the corresponding uppercase character.) [old_change_end] [change_begin] For the purposes of readtable-case, y is not replaceable. [change_end] Append y to the token being built, and repeat step 9.

    o If y is a single escape character, then read the next character and call it z (but if at end of file, signal an error instead). Ignore the usual syntax of z and pretend it is a constituent whose only attribute is alphabetic. [old_change_begin] (If z is a lowercase character, leave it alone; do not replace it with the corresponding uppercase character.) [old_change_end] [change_begin] For the purposes of readtable-case, z is not replaceable. [change_end] Append z to the token being built, and repeat step 9.

    o If y is a multiple escape character, then go to step 8.

    o If y is an illegal character, signal an error.

10. An entire token has been accumulated. [change_begin] X3J13 voted in June 1989 (READ-CASE-SENSITIVITY) to introduce readtable-case. If the accumulated token is to be interpreted as a symbol, any case conversion of replaceable characters should be performed at this point according to the value of the readtable-case slot of the current readtable (the value of *readtable*). [change_end] Interpret the token as representing a Lisp object and return that object as the result of the read operation, or signal an error if the token is not of legal syntax.
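The effect of steps 4 through 10 can be observed directly from the names of the symbols the reader produces. The following is a minimal sketch, assuming the standard readtable (whose readtable-case is :upcase); token-name is a hypothetical helper, not a standard function. Inside a Lisp string literal, \\ stands for a single backslash character in the text being read.

```lisp
(defun token-name (string)
  "Hypothetical helper: read one symbol token from STRING and return its name."
  (with-standard-io-syntax
    (symbol-name (read-from-string string))))

;; Steps 7, 8, 10: every character is replaceable and is upcased at step 10.
(token-name "abc")        ; => "ABC"

;; Step 8, single escape: the b after \ is not replaceable.
(token-name "a\\bc")      ; => "AbC"

;; Steps 6 and 9: characters between vertical bars, including the space,
;; are accumulated verbatim; the trailing c is back under step 8.
(token-name "|a b|c")     ; => "a bC"

;; Steps 8 and 9 alternate within a single token.
(token-name "ab|cd|ef")   ; => "ABcdEF"

;; Step 4: the ; macro character returns zero values, so the reader goes
;; back to step 1 and reads the object that follows the comment.
(with-standard-io-syntax
  (read-from-string (format nil "; a comment~%42")))   ; => 42
```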
[change_begin] X3J13 voted in March 1989 (CHARACTER-PROPOSAL) to specify that implementation-defined attributes may be removed from the characters of a symbol token when constructing the print name. It is implementation-dependent which attributes are removed. [change_end] As a rule, a single escape character never stands for itself but always serves to cause the following character to be treated as a simple alphabetic character. A single escape character can be included in a token only if preceded by another single escape character. A multiple escape character also never stands for itself. The characters between a pair of multiple escape characters are all treated as simple alphabetic characters, except that single escape and multiple escape characters must nevertheless be preceded by a single escape character to be included. ------------------------------------------------------------------------------- Compatibility note: In MacLisp, the | character is implemented as a macro character that reads characters up to the next unescaped | and then makes a token; no characters are ever read beyond the second | of a matching pair. In Common Lisp, the second | does not terminate the token being read but merely reverts to the ordinary (rather than multiple-escape) mode of token accumulation. This results in some differences in the way certain character sequences are interpreted. For example, the sequence |foo||bar| would be read in MacLisp as two distinct tokens, |foo| and |bar|, whereas in Common Lisp it would be treated as a single token equivalent to |foobar|. The sequence |foo|bar|baz| would be read in MacLisp as three distinct tokens, |foo|, bar, and |baz|, whereas in Common Lisp it would be treated as a single token equivalent to |fooBARbaz|; note that the middle three lowercase letters are converted to uppercase letters as they do not fall within a matching pair of vertical bars. 
One reason for the different treatment of | in Common Lisp lies in the syntax for package-qualified symbol names. A sequence such as |foo:bar| ought to be interpreted as a symbol whose name is foo:bar; the colon should be treated as a simple alphabetic character because it lies within a pair of vertical bars. The symbol |bar| within the package |foo| can be notated not as |foo:bar| but as |foo|:|bar|; the colon can serve as a package marker because it falls outside the vertical bars, and yet the notation is treated as a single token thanks to the new rules adopted in Common Lisp.

In MacLisp, the parentheses are treated as additional character types. In Common Lisp they are simply macro characters, as described in section 22.1.3.

What MacLisp calls ``single character objects'' (tokens of type single) are not provided for explicitly in Common Lisp. They can be viewed as simply a kind of macro character. That is, the effect of

    (setsyntax '$ 'single nil)
    (setsyntax '% 'single nil)

in MacLisp can be achieved in Common Lisp by

    ;; Return the symbol whose name is the single character CHAR.
    (defun single-macro-character (stream char)
      (declare (ignore stream))
      (intern (string char)))

    ;; set-macro-character takes a character object, hence #\$ and #\%.
    (set-macro-character #\$ #'single-macro-character)
    (set-macro-character #\% #'single-macro-character)

-------------------------------------------------------------------------------
-------------------------------------------------------------------------------

22.1.2. Parsing of Numbers and Symbols

When an extended token is read, it is interpreted as a number or symbol. In general, the token is interpreted as a number if it satisfies the syntax for numbers specified in table 22-2; this is discussed in more detail below.

The characters of the extended token may serve various syntactic functions as shown in table 22-3, but it must be remembered that any character included in a token under the control of an escape character is treated as alphabetic rather than according to the attributes shown in the table.
One consequence of this rule is that a whitespace, macro, or escape character will always be treated as alphabetic within an extended token because such a character cannot be included in an extended token except under the control of an escape character.

To allow for extensions to the syntax of numbers, a syntax for potential numbers is defined in Common Lisp that is more general than the actual syntax for numbers. Any token that is not a potential number and does not consist entirely of dots will always be taken to be a symbol, now and in the future; programs may rely on this fact. Any token that is a potential number but does not fit the actual number syntax defined below is a reserved token and has an implementation-dependent interpretation; an implementation may signal an error, quietly treat the token as a symbol, or take some other action. Programmers should avoid the use of such reserved tokens. (A symbol whose name looks like a reserved token can always be written using one or more escape characters.)

[change_begin] Just as bignum is the standard term used by Lisp implementors for very large integers, and flonum (rhymes with ``low hum'') refers to a floating-point number, the term potnum has been used widely as an abbreviation for ``potential number.'' ``Potnum'' rhymes with ``hot rum.'' [change_end]

----------------------------------------------------------------
Table 22-2: Actual Syntax of Numbers

number                ::= integer | ratio | floating-point-number
integer               ::= [sign] {digit}+ [decimal-point]
ratio                 ::= [sign] {digit}+ / {digit}+
floating-point-number ::= [sign] {digit}* decimal-point {digit}+ [exponent]
                        | [sign] {digit}+ [decimal-point {digit}*] exponent
sign                  ::= + | -
decimal-point         ::= .
digit                 ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
exponent              ::= exponent-marker [sign] {digit}+
exponent-marker       ::= e | s | f | d | l | E | S | F | D | L
----------------------------------------------------------------

----------------------------------------------------------------
Table 22-3: Standard Constituent Character Attributes

!     alphabetic       <page>    illegal     <backspace>  illegal
"     alphabetic *     <return>  illegal *   <tab>        illegal *
#     alphabetic *     <space>   illegal *   <newline>    illegal *
$     alphabetic       <rubout>  illegal     <linefeed>   illegal *
%     alphabetic       .         alphabetic, dot, decimal point
&     alphabetic       +         alphabetic, plus sign
'     alphabetic *     -         alphabetic, minus sign
(     alphabetic *     *         alphabetic
)     alphabetic *     /         alphabetic, ratio marker
,     alphabetic *     @         alphabetic
0     alphadigit       A, a      alphadigit
1     alphadigit       B, b      alphadigit
2     alphadigit       C, c      alphadigit
3     alphadigit       D, d      alphadigit, double-float exponent marker
4     alphadigit       E, e      alphadigit, float exponent marker
5     alphadigit       F, f      alphadigit, single-float exponent marker
6     alphadigit       G, g      alphadigit
7     alphadigit       H, h      alphadigit
8     alphadigit       I, i      alphadigit
9     alphadigit       J, j      alphadigit
:     package marker   K, k      alphadigit
;     alphabetic *     L, l      alphadigit, long-float exponent marker
<     alphabetic       M, m      alphadigit
=     alphabetic       N, n      alphadigit
>     alphabetic       O, o      alphadigit
?     alphabetic       P, p      alphadigit
[     alphabetic       Q, q      alphadigit
\     alphabetic *     R, r      alphadigit
]     alphabetic       S, s      alphadigit, short-float exponent marker
^     alphabetic       T, t      alphadigit
_     alphabetic       U, u      alphadigit
`     alphabetic *     V, v      alphadigit
{     alphabetic       W, w      alphadigit
|     alphabetic *     X, x      alphadigit
}     alphabetic       Y, y      alphadigit
~     alphabetic       Z, z      alphadigit
----------------------------------------------------------------

A token is a potential number if it satisfies the following requirements:

* It consists entirely of digits, signs (+ or -), ratio markers (/), decimal points (.), extension characters (^ or _), and number markers. (A number marker is a letter.
Whether a letter may be treated as a number marker depends on context, but no letter that is adjacent to another letter may ever be treated as a number marker. Floating-point exponent markers are instances of number markers.) * It contains at least one digit. (Letters may be considered to be digits, depending on the value of *read-base*, but only in tokens containing no decimal points.) * It begins with a digit, sign, decimal point, or extension character. * It does not end with a sign. As examples, the following tokens are potential numbers, but they are not actually numbers as defined below, and so are reserved tokens. (They do indicate some interesting possibilities for future extensions.) 1b5000 777777q 1.7J -3/4+6.7J 12/25/83 27^19 3^4/5 6//7 3.1.2.6 ^-43^ 3.141_592_653_589_793_238_4 -3.7+2.6i-6.17j+19.6k The following tokens are not potential numbers but are always treated as symbols: / /5 + 1+ 1- foo+ ab.cd _ ^ ^/- The following tokens are potential numbers if the value of *read-base* is 16 (an abnormal situation), but they are always treated as symbols if the value of *read-base* is 10 (the usual value): bad-face 25-dec-83 a/b fad_cafe f^ It is possible for there to be an ambiguity as to whether a letter should be treated as a digit or as a number marker. In such a case, the letter is always treated as a digit rather than as a number marker. Note that the printed representation for a potential number may not contain any escape characters. An escape character robs the following character of all syntactic qualities, forcing it to be strictly alphabetic and therefore unsuitable for use in a potential number. For example, all of the following representations are interpreted as symbols, not numbers: \256 25\64 1.0\E6 |100| 3\.14159 |3/4| 3\/4 5|| In each case, removing the escape character(s) would allow the token to be treated as a number. 
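The distinction between numbers and symbols drawn above can be checked with a small classifier. This is a minimal sketch under standard reader settings; token-kind is a hypothetical helper. Reserved tokens (potential numbers that are not actual numbers) are deliberately left out, since their treatment is implementation-dependent.

```lisp
(defun token-kind (string)
  "Hypothetical helper: classify what STRING reads as under standard syntax."
  (with-standard-io-syntax
    (let ((object (read-from-string string)))
      (cond ((numberp object) :number)
            ((symbolp object) :symbol)
            (t :other)))))

;; Tokens from the text that are not potential numbers: all are symbols.
(mapcar #'token-kind
        '("/" "/5" "+" "1+" "1-" "foo+" "ab.cd" "_" "^" "^/-"))

;; Tokens containing escape characters: all are symbols.
;; In a Lisp string literal, \\ stands for one backslash in the token.
(mapcar #'token-kind
        '("\\256" "25\\64" "1.0\\E6" "|100|" "3\\.14159" "|3/4|" "3\\/4" "5||"))

;; The same tokens with the escape characters removed: all are numbers.
(mapcar #'token-kind
        '("256" "2564" "1.0E6" "100" "3.14159" "3/4" "34" "5"))
```

The first two calls should return lists consisting only of :symbol, and the third a list consisting only of :number.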
If a potential number can in fact be interpreted as a number according to the BNF syntax in table 22-2, then a number object of the appropriate type is constructed and returned. It should be noted that in a given implementation it may be that not all tokens conforming to the actual syntax for numbers can actually be converted into number objects. For example, specifying too large or too small an exponent for a floating-point number may make the number impossible to represent in the implementation. Similarly, a ratio with denominator zero (such as -35/000) cannot be represented in any implementation. In any such circumstance where a token with the syntax of a number cannot be converted to an internal number object, an error is signaled. (On the other hand, an error must not be signaled for specifying too many significant digits for a floating-point number; an appropriately truncated or rounded value should be produced.) There is an omission in the syntax of numbers as described in table 22-2, in that the syntax does not account for the possible use of letters as digits. The radix used for reading integers and ratios is normally decimal. However, this radix is actually determined by the value of the variable *read-base*, whose initial value is 10. *read-base* may take on any integral value between 2 and 36; let this value be n. Then a token x is interpreted as an integer or ratio in base n if it could be properly so interpreted in the syntax #nRx (see section 22.1.4). So, for example, if the value of *read-base* is 16, then the printed representation (a small face in a bad place) would be interpreted as if the following representation had been read with *read-base* set to 10: (10 small 64206 in 10 2989 place) because four of the seven tokens in the list can be interpreted as hexadecimal numbers. 
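The hexadecimal example above can be reproduced as follows. This is a minimal sketch; read-in-base and printable-elements are hypothetical helper names, and with-standard-io-syntax is assumed so that everything except *read-base* has its standard value.

```lisp
(defun read-in-base (base string)
  "Hypothetical helper: read one object from STRING with *read-base* = BASE."
  (with-standard-io-syntax
    (let ((*read-base* base))
      (read-from-string string))))

(defun printable-elements (list)
  "Hypothetical helper: replace each symbol in LIST by its name."
  (mapcar (lambda (x) (if (symbolp x) (symbol-name x) x)) list))

;; Four of the seven tokens are valid hexadecimal numerals.
(printable-elements (read-in-base 16 "(a small face in a bad place)"))
;; => (10 "SMALL" 64206 "IN" 10 2989 "PLACE")

;; With the usual decimal radix, all seven tokens are symbols.
(every #'symbolp (read-in-base 10 "(a small face in a bad place)"))
;; => T
```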
This facility is intended to be used in reading files of data that for some reason contain numbers not in decimal radix; it may also be used for reading programs written in Lisp dialects (such as MacLisp) whose default number radix is not decimal. Non-decimal constants in Common Lisp programs or portable Common Lisp data files should be written using #O, #X, #B, or #nR syntax.

When *read-base* has a value greater than 10, an ambiguity is introduced into the actual syntax for numbers because a letter can serve as either a digit or an exponent marker; a simple example is 1E0 when the value of *read-base* is 16. The ambiguity is resolved in accordance with the general principle that interpretation as a digit is preferred to interpretation as a number marker. The consequence in this case is that if a token can be interpreted as either an integer or a floating-point number, then it is taken to be an integer.

If a token consists solely of dots (with no escape characters), then an error is signaled, except in one circumstance: if the token is a single dot and occurs in a situation appropriate to ``dotted list'' syntax, then it is accepted as a part of such syntax. Signaling an error catches not only misplaced dots in dotted list syntax but also lists that were truncated by *print-length* cutoff, because such lists end with a three-dot sequence (...). Examples:

    (a . b)         ;A dotted pair of a and b
    (a.b)           ;A list of one element, the symbol named a.b
    (a. b)          ;A list of two elements a. and b
    (a .b)          ;A list of two elements a and .b
    (a \. b)        ;A list of three elements a, ., and b
    (a |.| b)       ;A list of three elements a, ., and b
    (a \... b)      ;A list of three elements a, ..., and b
    (a |...| b)     ;A list of three elements a, ..., and b
    (a b . c)       ;A dotted list of a and b with c at the end
    .iot            ;The symbol whose name is .iot
    (. b)           ;Illegal; an error is signaled
    (a .)           ;Illegal; an error is signaled
    (a .. b)        ;Illegal; an error is signaled
    (a . . b)       ;Illegal; an error is signaled
    (a b c ...)     ;Illegal; an error is signaled

In all other cases, the token is construed to be the name of a symbol. If there are any package markers (colons) in the token, they divide the token into pieces used to control the lookup and creation of the symbol.

[old_change_begin]
If there is a single package marker, and it occurs at the beginning of the token, then the token is interpreted as a keyword, that is, a symbol in the keyword package. The part of the token after the package marker must not have the syntax of a number.

If there is a single package marker not at the beginning or end of the token, then it divides the token into two parts. The first part specifies a package; the second part is the name of an external symbol available in that package. Neither of the two parts may have the syntax of a number.

If there are two adjacent package markers not at the beginning or end of the token, then they divide the token into two parts. The first part specifies a package; the second part is the name of a symbol within that package (possibly an internal symbol). Neither of the two parts may have the syntax of a number.
[old_change_end]

[change_begin] X3J13 voted in March 1988 (COLON-NUMBER) to clarify that, in the situations described in the preceding three paragraphs, the restriction on the syntax of the parts should be strengthened: none of the parts may have the syntax of even a potential number. Tokens such as :3600, :1/2, and editor:3.14159 were already ruled out; this clarification further declares that such tokens as :2^3, compiler:1.7J, and Christmas:12/25/83 are also in error and therefore should not be used in portable programs. Implementations may differ in their treatment of such package-marked potential numbers. [change_end]

If a symbol token contains no package markers, then the entire token is the name of the symbol. The symbol is looked up in the default package, which is the value of the variable *package*.
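The three package-marker patterns, and the unqualified case, can be illustrated as follows. This is a minimal sketch; package-marker-demo is a hypothetical name, and with-standard-io-syntax is used on the assumption that *package* should be the COMMON-LISP-USER package, which inherits car from COMMON-LISP. The name some-internal-name is made up for the example and is interned by the reader if it does not already exist.

```lisp
(defun package-marker-demo ()
  "Hypothetical demo: read one token of each kind and report what was found."
  (with-standard-io-syntax
    (let ((kw   (read-from-string ":foo"))                        ; leading colon
          (ext  (read-from-string "cl:car"))                      ; one colon
          (int  (read-from-string "cl-user::some-internal-name")) ; two colons
          (bare (read-from-string "car")))                        ; no marker
      (list (keywordp kw)
            (eq ext 'cl:car)
            (eq (symbol-package int) (find-package "COMMON-LISP-USER"))
            ;; The unqualified token is looked up in *package* and finds
            ;; the same symbol that the qualified token names.
            (eq bare ext)))))

(package-marker-demo)   ; => (T T T T)
```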
All other patterns of package markers, including the cases where there are more than two package markers or where a package marker appears at the end of the token, at present do not mean anything in Common Lisp (see chapter 11). It is therefore currently an error to use such patterns in a Common Lisp program. The valid patterns for tokens may be summarized as follows:

    nnnnn           a number
    xxxxx           a symbol in the current package
    :xxxxx          a symbol in the keyword package
    ppppp:xxxxx     an external symbol in the ppppp package
    ppppp::xxxxx    a (possibly internal) symbol in the ppppp package

where nnnnn has the syntax of a number, and xxxxx and ppppp do not have the syntax of a number.

[change_begin] In accordance with the X3J13 decision noted above (COLON-NUMBER), xxxxx and ppppp may not have the syntax of even a potential number. [change_end]

[Variable]
*read-base*

The value of *read-base* controls the interpretation of tokens by read as being integers or ratios. Its value is the radix in which integers and ratios are to be read; the value may be any integer from 2 to 36 (inclusive) and is normally 10 (decimal radix). Its value affects only the reading of integers and ratios. In particular, floating-point numbers are always read in decimal radix. The value of *read-base* does not affect the radix for rational numbers whose radix is explicitly indicated by #O, #X, #B, or #nR syntax or by a trailing decimal point.

Care should be taken when setting *read-base* to a value larger than 10, because tokens that would normally be interpreted as symbols may be interpreted as numbers instead. For example, with *read-base* set to 16 (hexadecimal radix), variables with names such as a, b, f, bad, and face will be treated by the reader as numbers (with decimal values 10, 11, 15, 2989, and 64206, respectively).
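The cases in which *read-base* does and does not apply can be seen side by side. This is a minimal sketch; read-all-in-base-16 is a hypothetical helper. Note that the binding of *read-base* has to sit inside with-standard-io-syntax, because that macro rebinds *read-base* to 10.

```lisp
(defun read-all-in-base-16 (strings)
  "Hypothetical helper: read each string with *read-base* bound to 16."
  (with-standard-io-syntax
    (let ((*read-base* 16))
      (mapcar #'read-from-string strings))))

(read-all-in-base-16
 '("10"      ; integer, read in radix 16
   "10."     ; trailing decimal point forces decimal radix
   "#o10"    ; explicit radix prefix overrides *read-base*
   "#10r10"  ; likewise for #nR syntax
   "1/a"     ; ratio, read in radix 16
   "1.5"))   ; floating-point numbers are always decimal
;; => (16 10 8 10 1/10 1.5)
```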
The ability to alter the input radix is provided in Common Lisp primarily for the purpose of reading data files in special formats, rather than for the purpose of altering the default radix in which to read programs. The user is strongly encouraged to use #O, #X, #B, or #nR syntax when notating non-decimal constants in programs.

-------------------------------------------------------------------------------
Compatibility note: This variable corresponds to the variable called ibase in MacLisp and to the function called radix in Interlisp.
-------------------------------------------------------------------------------

[Variable]
*read-suppress*

When the value of *read-suppress* is nil, the Lisp reader operates normally. When it is not nil, then most of the interesting operations of the reader are suppressed; input characters are parsed, but much of what is read is not interpreted.

The primary purpose of *read-suppress* is to support the operation of the read-time conditional constructs #+ and #- (see section 22.1.4). It is important for these constructs to be able to skip over the printed representation of a Lisp expression despite the possibility that the syntax of the skipped expression may not be entirely legal for the current implementation; this is because a primary application of #+ and #- is to allow the same program to be shared among several Lisp implementations despite small incompatibilities of syntax.

A non-nil value of *read-suppress* has the following specific effects on the Common Lisp reader:

* All extended tokens are completely uninterpreted. It matters not whether the token looks like a number, much less like a valid number; the pattern of package markers also does not matter. An extended token is simply discarded and treated as if it were nil; that is, reading an extended token when *read-suppress* is non-nil simply returns nil. (One consequence of this is that the error concerning improper dotted-list syntax will not be signaled.)

* Any standard # macro-character construction that requires, permits, or disallows an infix numerical argument, such as #nR, will not enforce any constraint on the presence, absence, or value of such an argument.

* The #\ construction always produces the value nil. It will not signal an error even if an unknown character name is seen.

* Each of the #B, #O, #X, and #R constructions always scans over a following token and produces the value nil. It will not signal an error even if the token does not have the syntax of a rational number.

* The #* construction always scans over a following token and produces the value nil. It will not signal an error even if the token does not consist solely of the characters 0 and 1.

[old_change_begin]
* Each of the #. and #, constructions reads the following form (in suppressed mode, of course) but does not evaluate it. The form is discarded and nil is produced.
[old_change_end]

[change_begin]
X3J13 voted in January 1989 (SHARP-COMMA-CONFUSION) to remove #, from the language.
[change_end]

* Each of the #A, #S, and #: constructions reads the following form (in suppressed mode, of course) but does not interpret it in any way; it need not even be a list in the case of #S, or a symbol in the case of #:. The form is discarded and nil is produced.

* The #= construction is totally ignored. It does not read a following form. It produces no object, but is treated as whitespace.

* The ## construction always produces nil.

Note that, no matter what the value of *read-suppress*, parentheses still continue to delimit (and construct) lists; the #( construction continues to delimit vectors; and comments, strings, and the quote and backquote constructions continue to be interpreted properly. Furthermore, such situations as '), #<, #), and #space continue to signal errors.
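A few of the effects listed above can be observed with a short sketch. The helper read-suppressed is hypothetical, and the example assumes that no package named NO-SUCH-PACKAGE exists and that the feature name no-such-feature-xyzzy is not a member of *features*.

```lisp
;; Hypothetical helper: read one object from STRING with
;; *READ-SUPPRESS* bound to a non-nil value.
(defun read-suppressed (string)
  (let ((*read-suppress* t))
    (values (read-from-string string))))

;; An extended token is discarded and nil is returned, even though
;; the package it names is assumed not to exist.
(read-suppressed "no-such-package::token")

;; The #\ construction produces nil even for an unknown character name.
(read-suppressed "#\\No-Such-Character-Name")

;; The primary application: #+ skips a form that could not be read
;; normally in this implementation, so the list below reads as (1 2).
(read-from-string "(1 #+no-such-feature-xyzzy no-such-package::token 2)")
```

In ordinary programs *read-suppress* is rarely bound by hand; the last form shows the usual way its effect is seen, through #+ and #-.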
In some cases, it may be appropriate for a user-written macro-character definition to check the value of *read-suppress* and to avoid certain computations or side effects if its value is not nil. [change_begin] [Variable] *read-eval* X3J13 voted in June 1989 (DATA-IO) to add a new reader control variable, *read-eval*, whose default value is t. If *read-eval* is false, the #. reader macro signals an error. Printing is also affected. If *read-eval* is false and *print-readably* is true, any print-object method that would otherwise output a #. reader macro must either output something different or signal an error of type print-not-readable. Binding *read-eval* to nil is useful when reading data that came from an untrusted source, such as a network or a user-supplied data file; it prevents the #. reader macro from being exploited as a ``Trojan horse'' to cause arbitrary forms to be evaluated. [change_end] ------------------------------------------------------------------------------- 22.1.3. Macro Characters If the reader encounters a macro character, then the function associated with that macro character is invoked and may produce an object to be returned. This function may read following characters in the stream in whatever syntax it likes (it may even call read recursively) and return the object represented by that syntax. Macro characters may or may not be recognized, of course, when read as part of other special syntaxes (such as for strings). The reader is therefore organized into two parts: the basic dispatch loop, which also distinguishes symbols and numbers, and the collection of macro characters. Any character can be reprogrammed as a macro character; this is a means by which the reader can be extended. The macro characters normally defined are as follows: ( The left-parenthesis character initiates reading of a pair or list. The function read is called recursively to read successive objects until a right parenthesis is found to be next in the input stream. 
A list of the objects read is returned. Thus the input sequence (a b c) is read as a list of three objects (the symbols a, b, and c). The right parenthesis need not immediately follow the printed representation of the last object; whitespace characters and comments may precede it. This can be useful for putting one object on each line and making it easy to add new objects:

    (defun traffic-light (color)
      (case color
        (green)
        (red (stop))
        (amber (accelerate))    ;Insert more colors after this line
        ))

It may be that no objects precede the right parenthesis, as in () or ( ); this reads as a list of zero objects (the empty list).

If a token that is just a dot, not preceded by an escape character, is read after some object, then exactly one more object must follow the dot, possibly followed by whitespace, followed by the right parenthesis:

    (a b c . d)

This means that the cdr of the last pair in the list is not nil, but rather the object whose representation followed the dot. The above example might have been the result of evaluating

    (cons 'a (cons 'b (cons 'c 'd))) => (a b c . d)

Similarly, we have

    (cons 'znets 'wolq-zorbitan) => (znets . wolq-zorbitan)

It is permissible for the object following the dot to be a list:

    (a b c d . (e f . (g)))

is the same as

    (a b c d e f g)

but a list following a dot is a non-standard form that print will never produce.

)
The right-parenthesis character is part of various constructs (such as the syntax for lists) using the left-parenthesis character and is invalid except when used in such a construct.

'
The single-quote (accent acute) character provides an abbreviation to make it easier to put constants in programs. The form 'foo reads the same as (quote foo): a list of the symbol quote and foo.

;
Semicolon is used to write comments. The semicolon and all characters up to and including the next newline are ignored. Thus a comment can be put at the end of any line without affecting the reader.
(A comment will terminate a token, but a newline would terminate the token anyway.)

[change_begin]
There is no functional difference between using one semicolon and using more than one, but the conventions shown here are in common use.
[change_end]

    ;;;; COMMENT-EXAMPLE function.
    ;;; This function is useless except to demonstrate comments.
    ;;; (Actually, this example is much too cluttered with them.)

    (defun comment-example (x y)      ;X is anything; Y is an a-list.
      (cond ((listp x) x)             ;If X is a list, use that.
            ;; X is now not a list.  There are two other cases.
            ((symbolp x)
             ;; Look up a symbol in the a-list.
             (cdr (assoc x y)))       ;Remember, (cdr nil) is nil.
            ;; Do this when all else fails:
            (t (cons x                ;Add x to a default list.
                     '((lisp t)       ;LISP is okay.
                       (fortran nil)  ;FORTRAN is not.
                       (pl/i -500)    ;Note that you can put comments in
                       (ada .001)     ; "data" as well as in "programs".
                       ;; COBOL??
                       (teco -1.0e9))))))

In this example, comments may begin with one to four semicolons.

o Single-semicolon comments are all aligned to the same column at the right; usually each comment concerns only the code it is next to. Occasionally a comment is long enough to occupy two or three lines; in this case, it is conventional to indent the continued lines of the comment one space (after the semicolon).

o Double-semicolon comments are aligned to the level of indentation of the code. A space conventionally follows the two semicolons. Such comments usually describe the state of the program at that point or the code section that follows the comment.

o Triple-semicolon comments are aligned to the left margin. They usually document whole programs or large code blocks.

o Quadruple-semicolon comments usually indicate titles of whole programs or large code blocks.

-------------------------------------------------------------------------------
Compatibility note: These conventions arose among users of MacLisp and have been found to be very useful.
The conventions are conveniently exploited by certain software tools, such as the EMACS editor and the ATSIGN listing program developed at MIT. [change_begin] The ATSIGN listing program, alas, is no longer in use, but EMACS is widely available, especially the GNU EMACS implementation, which is available from the Free Software Foundation, 675 Massachusetts Avenue, Cambridge, Massachusetts 02139. Remember, GNU's Not UNIX. [change_end] ------------------------------------------------------------------------------- " The double quote character begins the printed representation of a string. Successive characters are read from the input stream and accumulated until another double quote is encountered. An exception to this occurs if a single escape character is seen; the escape character is discarded, the next character is accumulated, and accumulation continues. When a matching double quote is seen, all the accumulated characters up to but not including the matching double quote are made into a simple string and returned. ` The backquote (accent grave) character makes it easier to write programs to construct complex data structures by using a template. [change_begin] Notice of correction. In the first edition, the backquote character <`> appearing at the left margin above was inadvertently omitted. [change_end] As an example, writing `(cond ((numberp ,x) ,@y) (t (print ,x) ,@y)) is roughly equivalent to writing (list 'cond (cons (list 'numberp x) y) (list* 't (list 'print x) y)) The general idea is that the backquote is followed by a template, a picture of a data structure to be built. This template is copied, except that within the template commas can appear. Where a comma occurs, the form following the comma is to be evaluated to produce an object to be inserted at that point. Assume b has the value 3; then evaluating the form denoted by `(a b ,b ,(+ b 1) b) produces the result (a b 3 4 b). 
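The two backquote examples above can be checked directly. In this sketch the function names template-version and explicit-version are made up for illustration; the first builds the cond form from the backquote template and the second from the explicit calls to list, cons, and list* that the text calls roughly equivalent.

```lisp
;; Hypothetical names; both functions build the same COND form.
(defun template-version (x y)
  `(cond ((numberp ,x) ,@y) (t (print ,x) ,@y)))

(defun explicit-version (x y)
  (list 'cond
        (cons (list 'numberp x) y)
        (list* 't (list 'print x) y)))

;; The results are EQUAL, though they need not share structure
;; in the same way.
(equal (template-version 'n '((foo) (bar)))
       (explicit-version 'n '((foo) (bar))))

;; The simpler example from the text: with b bound to 3,
;; the template yields (a b 3 4 b).
(let ((b 3))
  `(a b ,b ,(+ b 1) b))
```

Comparing with equal rather than eq matters here, since an implementation is free to construct the result of a backquoted form in any way that yields an equal object.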
If a comma is immediately followed by an at-sign (@), then the form following the at-sign is evaluated to produce a list of objects. These objects are then ``spliced'' into place in the template. For example, if x has the value (a b c), then

    `(x ,x ,@x foo ,(cadr x) bar ,(cdr x) baz ,@(cdr x))
       => (x (a b c) a b c foo b bar (b c) baz b c)

The backquote syntax can be summarized formally as follows. For each of several situations in which backquote can be used, a possible interpretation of that situation as an equivalent form is given. Note that the form is equivalent only in the sense that when it is evaluated it will calculate the correct result. An implementation is quite free to interpret backquote in any way such that a backquoted form, when evaluated, will produce a result equal to that produced by the interpretation shown here.

o `basic is the same as 'basic, that is, (quote basic), for any form basic that is not a list or a general vector.

o `,form is the same as form, for any form, provided that the representation of form does not begin with ``@'' or ``.''. (A similar caveat holds for all occurrences of a form after a comma.)

o `,@form is an error.

o `(x1 x2 x3 ... xn . atom) may be interpreted to mean

      (append [x1] [x2] [x3] ... [xn] (quote atom))

  where the brackets are used to indicate a transformation of an xj as follows:

  + [form] is interpreted as (list `form), which contains a backquoted form that must then be further interpreted.

  + [,form] is interpreted as (list form).

  + [,@form] is interpreted simply as form.

o `(x1 x2 x3 ... xn) may be interpreted to mean the same as the backquoted form `(x1 x2 x3 ... xn . nil), thereby reducing it to the previous case.

o `(x1 x2 x3 ... xn . ,form) may be interpreted to mean

      (append [x1] [x2] [x3] ... [xn] form)

  where the brackets indicate a transformation of an xj as described above.

o `(x1 x2 x3 ... xn . ,@form) is an error.

o `#(x1 x2 x3 ... xn) may be interpreted to mean

      (apply #'vector `(x1 x2 x3 ... xn))

No other uses of comma are permitted; in particular, it may not appear within the #A or #S syntax.

Anywhere ``,@'' may be used, the syntax ``,.'' may be used instead to indicate that it is permissible to destroy the list produced by the form following the ``,.''; this may permit more efficient code, using nconc instead of append, for example.

If the backquote syntax is nested, the innermost backquoted form should be expanded first. This means that if several commas occur in a row, the leftmost one belongs to the innermost backquote.

Once again, it is emphasized that an implementation is free to interpret a backquoted form as any form that, when evaluated, will produce a result that is equal to the result implied by the above definition. In particular, no guarantees are made as to whether the constructed copy of the template will or will not share list structure with the template itself. As an example, the above definition implies that

    `((,a b) ,c ,@d)

will be interpreted as if it were

    (append (list (append (list a) (list 'b) 'nil)) (list c) d 'nil)

but it could also be legitimately interpreted to mean any of the following.

    (append (list (append (list a) (list 'b))) (list c) d)
    (append (list (append (list a) '(b))) (list c) d)
    (append (list (cons a '(b))) (list c) d)
    (list* (cons a '(b)) c d)
    (list* (cons a (list 'b)) c d)
    (list* (cons a '(b)) c (copy-list d))

(There is no good reason why copy-list should be performed, but it is not prohibited.)

[change_begin]
Some users complain that backquote syntax is difficult to read, especially when it is nested. I agree that it can get complicated, but in some situations (such as writing macros that expand into definitions for other macros) such complexity is to be expected, and the alternative is much worse.
After I gained some experience in writing nested backquote forms, I found that I was not stopping to analyze the various patterns of nested backquotes and interleaved commas and quotes; instead, I was recognizing standard idioms wholesale, in the same manner that I recognize cadar as the primitive for ``extract the lambda-list from the form ((lambda ...) ...)'' without stopping to analyze it into ``car of cdr of car.'' For example, ,x within a doubly-nested backquote form means ``the value of x available during the second evaluation will appear here once the form has been twice evaluated,'' whereas ,',x means ``the value of x available during the first evaluation will appear here once the form has been twice evaluated'' and ,,x means ``the value of the value of x will appear here.''

See appendix C for a systematic set of examples of the use of nested backquotes.
[change_end]

,
The comma character is part of the backquote syntax and is invalid if used other than inside the body of a backquote construction as described above.

#
This is a dispatching macro character. It reads an optional digit string and then one more character, and uses that character to select a function to run as a macro-character function.

The # character also happens to be a non-terminating macro character. This is completely independent of the fact that it is a dispatching macro character; it is a coincidence that the only standard dispatching macro character in Common Lisp is also the only standard non-terminating macro character.

See the next section for predefined # macro-character constructions.

-------------------------------------------------------------------------------

22.1.4. Standard Dispatching Macro Character Syntax

The standard syntax includes forms introduced by the # character. These take the general form of a #, a second character that identifies the syntax, and following arguments in some form.
If the second character is a letter, then case is not important; #O and #o are considered to be equivalent, for example.

Certain # forms allow an unsigned decimal number to appear between the # and the second character; some other forms even require it. Those forms that do not explicitly permit such a number to appear forbid it.

----------------------------------------------------------------
Table 22-4: Standard # Macro Character Syntax

#!  undefined *                 #<backspace>  signals error
#"  undefined                   #<tab>        signals error
##  reference to #= label       #<newline>    signals error
#$  undefined                   #<linefeed>   signals error
#%  undefined                   #<page>       signals error
#&  undefined                   #<return>     signals error
#'  function abbreviation       #<space>      signals error
#(  simple vector               #+            read-time conditional
#)  signals error               #-            read-time conditional
#*  bit-vector                  #.            read-time evaluation
#,  load-time evaluation        #/            undefined
#0  used for infix arguments    #A, #a        array
#1  used for infix arguments    #B, #b        binary rational
#2  used for infix arguments    #C, #c        complex number
#3  used for infix arguments    #D, #d        undefined
#4  used for infix arguments    #E, #e        undefined
#5  used for infix arguments    #F, #f        undefined
#6  used for infix arguments    #G, #g        undefined
#7  used for infix arguments    #H, #h        undefined
#8  used for infix arguments    #I, #i        undefined
#9  used for infix arguments    #J, #j        undefined
#:  uninterned symbol           #K, #k        undefined
#;  undefined                   #L, #l        undefined
#<  signals error               #M, #m        undefined
#=  label following object      #N, #n        undefined
#>  undefined                   #O, #o        octal rational
#?  undefined *                 #P, #p        pathname
#@  undefined                   #Q, #q        undefined
#[  undefined *                 #R, #r        radix-n rational
#\  character object            #S, #s        structure
#]  undefined *                 #T, #t        undefined
#^  undefined                   #U, #u        undefined
#_  undefined                   #V, #v        undefined
#`  undefined                   #W, #w        undefined
#{  undefined *                 #X, #x        hexadecimal rational
#|  balanced comment            #Y, #y        undefined
#}  undefined *                 #Z, #z        undefined
#~  undefined                   #<rubout>     undefined

The combinations marked by an asterisk are explicitly reserved to the user and will never be defined by Common Lisp.

[change_begin]
X3J13 voted in June 1989 (PATHNAME-PRINT-READ) to specify #P and #p (undefined in the first edition).
[change_end]
----------------------------------------------------------------

The currently defined # constructs are described below and summarized in table 22-4; more are likely to be added in the future. However, the constructs #!, #?, #[, #], #{, and #} are explicitly reserved for the user and will never be defined by the Common Lisp standard.

#\
#\x reads in as a character object that represents the character x. Also, #\name reads in as the character object whose name is name. Note that the backslash \ allows this construct to be parsed easily by EMACS-like editors.

In the single-character case, the character x must be followed by a non-constituent character, lest a name appear to follow the #\. A good model of what happens is that after #\ is read, the reader backs up over the \ and then reads an extended token, treating the initial \ as an escape character (whether it really is or not in the current readtable).

Uppercase and lowercase letters are distinguished after #\; #\A and #\a denote different character objects. Any character works after #\, even those that are normally special to read, such as parentheses. Non-printing characters may be used after #\, although for them names are generally preferred.
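The single-character and named forms of #\ described above can be illustrated with a few comparisons. This is only a sketch; it uses the standard functions char= and char, and the strings are there merely to supply characters to compare against.

```lisp
;; Uppercase and lowercase letters are distinguished after #\.
(char= #\A #\a)                  ; false

;; Characters that are normally special to the reader work after #\.
(char= #\( (char "(" 0))         ; true
(char= #\; (char ";" 0))         ; true

;; A name may be used instead of the character itself, which is
;; generally preferable for non-printing characters.
(char= #\Space (char " " 0))     ; true
```

Note that in each form the character after #\ is followed by whitespace or a parenthesis, so that no longer name appears to follow the #\.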
#\name reads in as a character object whose name is name (actually, whose name is (string-upcase name); therefore the syntax is case-insensitive). The name should have the syntax of a symbol. The following names are standard across all implementations:

    newline      The character that represents the division between lines
    space        The space or blank character

The following names are semi-standard; if an implementation supports them, they should be used for the described characters and no others.

    rubout       The rubout or delete character
    page         The form-feed or page-separator character
    tab          The tabulate character
    backspace    The backspace character
    return       The carriage return character
    linefeed     The line-feed character

In some implementations, one or more of these characters might be a synonym for a standard character; the #\Linefeed character might be the same as #\Newline, for example.

When the Lisp printer types out the name of a special character, it uses the same table as the #\ reader; therefore any character name you see typed out is acceptable as input (in that implementation). Standard names are always preferred over non-standard names for printing.

The following convention is used in implementations that support non-zero bits attributes for character objects. If a name after #\ is longer than one character and has a hyphen in it, then it may be split into the two parts preceding and following the first hyphen; the first part (actually, string-upcase of the first part) may then be interpreted as the name or initial of a bit, and the second part as the name of the character (which may in turn contain a hyphen and be subject to further splitting). For example:

    #\Control-Space         #\Control-Meta-Tab
    #\C-M-Return            #\H-S-M-C-Rubout

If the character name consists of a single character, then that character is used. Another \ may be necessary to quote the character.

    #\Control-%             #\Control-Meta-\"
    #\Control-\a            #\Meta->

[old_change_begin]
If an unsigned decimal integer appears between the # and \, it is interpreted as a font number, to become the font attribute of the character object (see char-font).
[old_change_end]

[change_begin]
X3J13 voted in March 1989 (CHARACTER-PROPOSAL) to replace the notion of bits and font attributes with that of implementation-defined attributes. Presumably this eliminates the portable use of this syntax for font information, although the vote did not address this question directly.
[change_end]

#'
#'foo is an abbreviation for (function foo). foo may be the printed representation of any Lisp object. This abbreviation may be remembered by analogy with the ' macro character, since the function and quote special forms are similar in form.

#(
A series of representations of objects enclosed by #( and ) is read as a simple vector of those objects. This is analogous to the notation for lists.

If an unsigned decimal integer appears between the # and (, it specifies explicitly the length of the vector. In that case, it is an error if too many objects are specified before the closing ), and if too few are specified, the last object (it is an error if there are none in this case) is used to fill all remaining elements of the vector. For example,

    #(a b c c c c)    #6(a b c c c c)    #6(a b c)    #6(a b c c)

all mean the same thing: a vector of length 6 with elements a, b, and four instances of c. The notation #() denotes an empty vector, as does #0() (which is legitimate because it is not the case that too few elements are specified).

#*
A series of binary digits (0 and 1) preceded by #* is read as a simple bit-vector containing those bits, the leftmost bit in the series being bit 0 of the bit-vector.

If an unsigned decimal integer appears between the # and *, it specifies explicitly the length of the vector.
In that case, it is an error if too many bits are specified, and if too few are specified the last one (it is an error if there are none in this case) is used to fill all remaining elements of the bit-vector. For example, #*101111 #6*101111 #6*101 #6*1011 all mean the same thing: a vector of length 6 with elements 1, 0, 1, 1, 1, and 1. The notation #* denotes an empty bit-vector, as does #0* (which is legitimate because it is not the case that too few elements are specified). [change_begin] Compare this to #B, used for expressing integers in binary notation. [change_end] #: #:foo requires foo to have the syntax of an unqualified symbol name (no embedded colons). It denotes an uninterned symbol whose name is foo. Every time this syntax is encountered, a different uninterned symbol is created. If it is necessary to refer to the same uninterned symbol more than once in the same expression, the #= syntax may be useful. #. #.foo is read as the object resulting from the evaluation of the Lisp object represented by foo, which may be the printed representation of any Lisp object. The evaluation is done during the read process, when the #. construct is encountered. [change_begin] X3J13 voted in June 1989 (DATA-IO) to add a new reader control variable, *read-eval*. If it is true, the #. reader macro behaves as described above; if it is false, the #. reader macro signals an error. [change_end] The #. syntax therefore performs a read-time evaluation of foo. By contrast, #, (see below) performs a load-time evaluation. Both #. and #, allow you to include, in an expression being read, an object that does not have a convenient printed representation; instead of writing a representation for the object, you write an expression that will compute the object. [old_change_begin] #, #,foo is read as the object resulting from the evaluation of the Lisp object represented by foo, which may be the printed representation of any Lisp object. 
The evaluation is done during the read process, unless the compiler is doing the reading, in which case it is arranged that foo will be evaluated when the file of compiled code is loaded. The #, syntax therefore performs a load-time evaluation of foo. By contrast, #. (see above) performs a read-time evaluation. In a sense, #, is like specifying (eval load) to eval-when, whereas #. is more like specifying (eval compile). It makes no difference when loading interpreted code; when code is to be compiled, however, #. specifies compile-time evaluation and #, specifies load-time evaluation. [old_change_end] [change_begin] X3J13 voted in January 1989 (SHARP-COMMA-CONFUSION) to remove #, from the language. X3J13 noted that the first edition failed to make it clear that #, can be meaningful only within quoted forms. All sorts of anomalies can arise, including inconsistencies between the interpreter and compiler, if #, is not properly restricted. See load-time-eval. [change_end] #B #brational reads rational in binary (radix 2). For example, #B1101 == 13, and #b101/11 == 5/3. [change_begin] Compare this to #*, used for expressing bit-vectors in binary notation. [change_end] #O #orational reads rational in octal (radix 8). For example, #o37/15 == 31/13, and #o777 == 511. #X #xrational reads rational in hexadecimal (radix 16). The digits above 9 are the letters A through F (the lowercase letters a through f are also acceptable). For example, #xF00 == 3840. #nR #radixrrational reads rational in radix radix. radix must consist of only digits, and it is read in decimal; its value must be between 2 and 36 (inclusive). For example, #3r102 is another way of writing 11, and #11R32 is another way of writing 35. For radices larger than 10, letters of the alphabet are used in order for the digits after 9. #nA The syntax #nAobject constructs an n-dimensional array, using object as the value of the :initial-contents argument to make-array. 
The value of n makes a difference: #2A((0 1 5) (foo 2 (hot dog))), for example, represents a 2-by-3 matrix:

    0       1       5
    foo     2       (hot dog)

In contrast, #1A((0 1 5) (foo 2 (hot dog))) represents a length-2 array whose elements are lists:

    (0 1 5)    (foo 2 (hot dog))

Furthermore, #0A((0 1 5) (foo 2 (hot dog))) represents a zero-dimensional array whose sole element is a list:

    ((0 1 5) (foo 2 (hot dog)))

Similarly, #0Afoo (or, more readably, #0A foo) represents a zero-dimensional array whose sole element is the symbol foo. The expression #1Afoo would not be legal because foo is not a sequence.

#S
The syntax #s(name slot1 value1 slot2 value2 ...) denotes a structure. This is legal only if name is the name of a structure already defined by defstruct and if the structure has a standard constructor macro, which it normally will. Let cm stand for the name of this constructor macro; then this syntax is equivalent to

    #.(cm keyword1 'value1 keyword2 'value2 ...)

where each keywordj is the result of computing

    (intern (string slotj) 'keyword)

(This computation is made so that one need not write a colon in front of every slot name.) The net effect is that the constructor macro is called with the specified slots having the specified values (note that one does not write quote marks in the #S syntax). Whatever object the constructor macro returns is returned by the #S syntax.

[change_begin]
#P
X3J13 voted in June 1989 (PATHNAME-PRINT-READ) to define the reader syntax #p"..." to be equivalent to #.(parse-namestring "..."). Presumably this was meant to be taken descriptively and not literally. I would think, for example, that the committee did not wish to quibble over the package in which the name parse-namestring was to be read. Similarly, I would presume that the #p syntax operates normally rather than signaling an error when *read-eval* is false.
I interpret the intent of the vote to be that #p reads a following form, which should be a string, that is then converted to a pathname as if by application of the standard function parse-namestring.
[change_end]

#n=
The syntax #n=object reads as whatever Lisp object has object as its printed representation. However, that object is labelled by n, a required unsigned decimal integer, for possible reference by the syntax #n# (below). The scope of the label is the expression being read by the outermost call to read. Within this expression the same label may not appear twice.

#n#
The syntax #n#, where n is a required unsigned decimal integer, serves as a reference to some object labelled by #n=; that is, #n# represents a pointer to the same identical (eq) object labelled by #n=. This permits notation of structures with shared or circular substructure. For example, a structure created in the variable y by this code:

    (setq x (list 'p 'q))
    (setq y (list (list 'a 'b) x 'foo x))
    (rplacd (last y) (cdr y))

could be represented in this way:

    ((a b) . #1=(#2=(p q) foo #2# . #1#))

Without this notation, but with *print-length* set to 10, the structure would print in this way:

    ((a b) (p q) foo (p q) (p q) foo (p q) (p q) foo (p q) ...)

A reference #n# may occur only after a label #n=; forward references are not permitted. In addition, the reference may not appear as the labelled object itself (that is, one may not write #n= #n#), because the object labelled by #n= is not well defined in this case.

#+
The #+ syntax provides a read-time conditionalization facility; the syntax is

    #+feature form

If feature is ``true,'' then this syntax represents a Lisp object whose printed representation is form. If feature is ``false,'' then this syntax is effectively whitespace; it is as if it did not appear.

The feature should be the printed representation of a symbol or list.
If feature is a symbol, then it is true if and only if it is a member of the list that is the value of the global variable *features*.

-------------------------------------------------------------------------------
Compatibility note: MacLisp uses the status special form for this purpose, and Lisp Machine Lisp duplicates status essentially only for the sake of (status features). The use of a variable allows one to bind the features list, when compiling, for example.
-------------------------------------------------------------------------------

Otherwise, feature should be a Boolean expression composed of and, or, and not operators on (recursive) feature expressions. For example, suppose that in implementation A the features spice and perq are true, and in implementation B the feature lispm is true. Then the expressions on the left below are read the same as those on the right in implementation A:

    (cons #+spice "Spice" #+lispm "Lispm" x)      (cons "Spice" x)
    (setq a '(1 2 #+perq 43 #+(not perq) 27))     (setq a '(1 2 43))
    (let ((a 3) #+(or spice lispm) (b 3))         (let ((a 3) (b 3))
      (foo a))                                      (foo a))
    (cons a #+perq #-perq b c)                    (cons a c)

In implementation B, however, they are read in this way:

    (cons #+spice "Spice" #+lispm "Lispm" x)      (cons "Lispm" x)
    (setq a '(1 2 #+perq 43 #+(not perq) 27))     (setq a '(1 2 27))
    (let ((a 3) #+(or spice lispm) (b 3))         (let ((a 3) (b 3))
      (foo a))                                      (foo a))
    (cons a #+perq #-perq b c)                    (cons a c)

The #+ construction must be used judiciously if unreadable code is not to result. The user should make a careful choice between read-time conditionalization and run-time conditionalization.

[old_change_begin] The #+ syntax operates by first reading the feature specification and then skipping over the form if the feature is ``false.'' This skipping of a form is a bit tricky because of the possibility of user-defined macro characters and side effects caused by the #. and #, constructions.
It is accomplished by binding the variable *read-suppress* to a non-nil value and then calling the read function. See the description of *read-suppress* for the details of this operation. [old_change_end] [change_begin] X3J13 voted in January 1989 (SHARP-COMMA-CONFUSION) to remove #, from the language. X3J13 voted in March 1988 (SHARPSIGN-PLUS-MINUS-PACKAGE) to specify that the keyword package is the default package during the reading of a feature specification. Thus #+spice means the same thing as #+:spice, and #+(or spice lispm) means the same thing as #+(or :spice :lispm). Symbols in other packages may be used as feature names, but one must use an explicit package prefix to cite one after #+. [change_end] #- #-feature form is equivalent to #+(not feature) form. #| #|...|# is treated as a comment by the reader, just as everything from a semicolon to the next newline is treated as a comment. Anything may appear in the comment, except that it must be balanced with respect to other occurrences of #| and |#. Except for this nesting rule, the comment may contain any characters whatsoever. The main purpose of this construct is to allow ``commenting out'' of blocks of code or data. The balancing rule allows such blocks to contain pieces already so commented out. In this respect the #|...|# syntax of Common Lisp differs from the /*...*/ comment syntax used by PL/I and C. #< This is not legal reader syntax. It is conventionally used in the printed representation of objects that cannot be read back in. Attempting to read a #< will cause an error. (More precisely, it is legal syntax, but the macro-character function for #< signals an error.) [change_begin] The usual convention for printing unreadable data objects is to print some identifying information (the internal machine address of the object, if nothing else) preceded by #< and followed by >. 
X3J13 voted in June 1989 (DATA-IO) to add print-unreadable-object, a macro that prints an object using #<...> syntax and also takes care of checking the variable *print-readably*. [change_end] #<space>, #<tab>, #<newline>, #<page>, #<return> A # followed by a whitespace character is not legal reader syntax. This prevents abbreviated forms produced via *print-level* cutoff from reading in again, as a safeguard against losing information. (More precisely, this is legal syntax, but the macro-character function for it signals an error.) #) This is not legal reader syntax. This prevents abbreviated forms produced via *print-level* cutoff from reading in again, as a safeguard against losing information. (More precisely, this is legal syntax, but the macro-character function for it signals an error.) ------------------------------------------------------------------------------- 22.1.5. The Readtable Previous sections describe the standard syntax accepted by the read function. This section discusses the advanced topic of altering the standard syntax either to provide extended syntax for Lisp objects or to aid the writing of other parsers. There is a data structure called the readtable that is used to control the reader. It contains information about the syntax of each character equivalent to that in table 22-1. It is set up exactly as in table 22-1 to give the standard Common Lisp meanings to all the characters, but the user can change the meanings of characters to alter and customize the syntax of characters. It is also possible to have several readtables describing different syntaxes and to switch from one to another by binding the variable *readtable*. [old_change_begin] Even if an implementation supports characters with non-zero bits and font attributes, it need not (but may) allow for such characters to have syntax descriptions in the readtable. However, every character of type string-char must be represented in the readtable. 
[old_change_end] [change_begin] X3J13 voted in March 1989 (CHARACTER-PROPOSAL) to remove the type string-char and to replace the bits and font attributes with the notion of implementation-defined attributes. If any implementation-defined attributes are supported, an implementation may (but need not) allow for such characters to have syntax descriptions in the readtable. Characters that do not have non-standard values for any implementation-defined attribute must be represented in the readtable. [change_end] [Variable] *readtable* The value of *readtable* is the current readtable. The initial value of this is a readtable set up for standard Common Lisp syntax. You can bind this variable to temporarily change the readtable being used. To program the reader for a different syntax, a set of functions are provided for manipulating readtables. Normally, you should begin with a copy of the standard Common Lisp readtable and then customize the individual characters within that copy. [Function] copy-readtable &optional from-readtable to-readtable A copy is made of from-readtable, which defaults to the current readtable (the value of the global variable *readtable*). If from-readtable is nil, then a copy of a standard Common Lisp readtable is made. For example, (setq *readtable* (copy-readtable nil)) will restore the input syntax to standard Common Lisp syntax, even if the original readtable has been clobbered (assuming it is not so badly clobbered that you cannot type in the above expression!). On the other hand, (setq *readtable* (copy-readtable)) will merely replace the current readtable with a copy of itself. If to-readtable is unsupplied or nil, a fresh copy is made. Otherwise, to-readtable must be a readtable, which is destructively copied into. [Function] readtablep object readtablep is true if its argument is a readtable, and otherwise is false. 
(readtablep x) == (typep x 'readtable) [Function] set-syntax-from-char to-char from-char &optional to-readtable from-readtable This makes the syntax of to-char in to-readtable be the same as the syntax of from-char in from-readtable. The to-readtable defaults to the current readtable (the value of the global variable *readtable*), and from-readtable defaults to nil, meaning to use the syntaxes from the standard Lisp readtable. [change_begin] X3J13 voted in January 1989 (ARGUMENTS-UNDERSPECIFIED) to clarify that the to-char and from-char must each be a character. [change_end] Only attributes as shown in table 22-1 are copied; moreover, if a macro character is copied, the macro definition function is copied also. However, attributes as shown in table 22-3 are not copied; they are ``hard-wired'' into the extended-token parser. For example, if the definition of S is copied to *, then * will become a constituent that is alphabetic but cannot be used as an exponent indicator for short-format floating-point number syntax. It works to copy a macro definition from a character such as " to another character; the standard definition for " looks for another character that is the same as the character that invoked it. It doesn't work to copy the definition of ( to {, for example; it can be done, but it lets one write lists in the form {a b c), not {a b c}, because the definition always looks for a closing parenthesis, not a closing brace. See the function read-delimited-list, which is useful in this connection. [change_begin] X3J13 voted in January 1989 (RETURN-VALUES-UNSPECIFIED) to specify that the set-syntax-from-char function returns t. [change_end] [Function] set-macro-character char function &optional non-terminating-p readtable get-macro-character char &optional readtable set-macro-character causes char to be a macro character that when seen by read causes function to be called. 
If non-terminating-p is not nil (it defaults to nil), then it will be a non-terminating macro character: it may be embedded within extended tokens. set-macro-character returns t. get-macro-character returns the function associated with char and, as a second value, returns the non-terminating-p flag; it returns nil if char does not have macro-character syntax. In each case, readtable defaults to the current readtable.

[change_begin] X3J13 voted in January 1989 (GET-MACRO-CHARACTER-READTABLE) to specify that if nil is explicitly passed as the second argument to get-macro-character, then the standard readtable is used. This is consistent with the behavior of copy-readtable. [change_end]

The function is called with two arguments, stream and char. The stream is the input stream, and char is the macro character itself. In the simplest case, function may return a Lisp object. This object is taken to be that whose printed representation was the macro character and any following characters read by the function. As an example, a plausible definition of the standard single quote character is:

    (defun single-quote-reader (stream char)
      (declare (ignore char))
      (list 'quote (read stream t nil t)))

    (set-macro-character #\' #'single-quote-reader)

(Note that t is specified for the recursive-p argument to read; see section 22.2.1.) The function reads an object following the single-quote and returns a list of the symbol quote and that object. The char argument is ignored.

The function may choose instead to return zero values (for example, by using (values) as the return expression). In this case, the macro character and whatever it may have read contribute nothing to the object being read. As an example, here is a plausible definition for the standard semicolon (comment) character:

    (defun semicolon-reader (stream char)
      (declare (ignore char))
      ;; First swallow the rest of the current input line.
      ;; End-of-file is acceptable for terminating the comment.
      (do () ((char= (read-char stream nil #\Newline t) #\Newline)))
      ;; Return zero values.
      (values))

    (set-macro-character #\; #'semicolon-reader)

(Note that t is specified for the recursive-p argument to read-char; see section 22.2.1.)

The function should not have any side effects other than on the stream. Because of backtracking and restarting of the read operation, front ends (such as editors and rubout handlers) to the reader may cause function to be called repeatedly during the reading of a single expression in which the macro character only appears once.

-------------------------------------------------------------------------------
Compatibility note: The ability to return either zero or one value is the closest Common Lisp macro characters come to the splicing macro characters of MacLisp or the splice macro characters of Interlisp. The Common Lisp definition does not allow the splicing of arbitrarily many values, but it does allow a macro-character function to decide after it is invoked whether or not to yield a value, an option not possible in MacLisp or Interlisp. MacLisp has nothing equivalent to non-terminating macro characters. The Interlisp equivalents of terminating and non-terminating macro characters are macro characters with the ALWAYS or FIRST option, respectively. Common Lisp has nothing equivalent to the Interlisp ALONE macro-character option.
-------------------------------------------------------------------------------

[change_begin] Here is an example of a more elaborate set of read-macro characters that I used in the implementation of the original simulator for Connection Machine Lisp [44,57], a parallel dialect of Common Lisp. This simulator was used to gain experience with the language before freezing its design for full-scale implementation on a Connection Machine computer system.
This example illustrates the typical manner in which a language designer can embed a new language within the syntactic and semantic framework of Lisp, saving the effort of designing an implementation from scratch. Connection Machine Lisp introduces a new data type called a xapping, which is simply an unordered set of ordered pairs of Lisp objects. The first element of each pair is called the index and the second element the value. We say that the xapping maps each index to its corresponding value. No two pairs of the same xapping may have the same (that is, eql) index. Xappings may be finite or infinite sets of pairs; only certain kinds of infinite xappings are required, and special representations are used for them. A finite xapping is notated by writing the pairs between braces, separated by whitespace. A pair is notated by writing the index and the value, separated by a right arrow (or an exclamation point if the host Common Lisp has no right-arrow character). ------------------------------------------------------------------------------- Remark: The original language design used the right arrow; the exclamation point was chosen to replace it on ASCII-only terminals because it is one of the six characters [ ] { } ! ? reserved by Common Lisp to the user. While preparing the TeX manuscript for this book I made a mistake in font selection and discovered that by an absolutely incredible coincidence the right arrow has the same numerical code (octal 41) within TeX fonts as the ASCII exclamation point. The result was that although the manuscript called for right arrows, exclamation points came out in the printed copy. Imagine my astonishment! ------------------------------------------------------------------------------- Here is an example of a xapping that maps three symbols to strings: {moe->"Oh, a wise guy, eh?" larry->"Hey, what's the idea?" curly->"Nyuk, nyuk, nyuk!"} For convenience there are certain abbreviated notations. 
If the index and value for a pair are the same object x, then instead of having to write ``x->x'' (or, worse yet, ``#43=x->#43#'') we may write simply x for the pair. If all pairs of a xapping are of this form, we call the xapping a xet. For example, the notation

    {baseball chess cricket curling bocce 43-man-squamish}

is entirely equivalent in meaning to

    {baseball->baseball curling->curling cricket->cricket chess->chess bocce->bocce 43-man-squamish->43-man-squamish}

namely a xet of symbols naming six sports.

Another useful abbreviation covers the situation where the indices of the n pairs of a finite xapping are integers, collectively covering a range from zero to n-1. This kind of xapping is called a xector and may be notated by writing the values between brackets in ascending order of their indices. Thus

    [tinker evers chance]

is merely an abbreviation for

    {0->tinker 1->evers 2->chance}

There are two kinds of infinite xapping: constant and universal. A constant xapping {->z} maps every object to the same value z. The universal xapping {->} maps every object to itself and is therefore the xet of all Lisp objects, sometimes called simply the universe. Both kinds of infinite xapping may be modified by explicitly writing exceptions. One kind of exception is simply a pair, which specifies the value for a particular index; the other kind of exception is simply k-> (an index followed by an arrow and then whitespace), indicating that the xapping does not have a pair with index k after all. Thus the notation

    {sky->blue grass->green idea-> glass-> ->red}

indicates a xapping that maps sky to blue, grass to green, and every other object except idea and glass to red. Note well that the presence or absence of whitespace on either side of an arrow is crucial to the correct interpretation of the notation.
Here is the representation of a xapping as a structure:

    (defstruct
      (xapping (:print-function print-xapping)
               (:constructor xap
                 (domain range &optional
                  (default ':unknown defaultp)
                  (infinite (and defaultp :constant))
                  (exceptions '()))))
      domain
      range
      default
      (infinite nil :type (member nil :constant :universal))
      exceptions)

The explicit pairs are represented as two parallel lists, one of indexes (domain) and one of values (range). The default slot is the default value, relevant only if the infinite slot is :constant. The exceptions slot is a list of indices for which there are no values. (See the end of section 22.3.3 for the definition of print-xapping.)

Here, then, is the code for reading xectors in bracket notation:

    (defun open-bracket-macro-char (stream macro-char)
      (declare (ignore macro-char))
      (let ((range (read-delimited-list #\] stream t)))
        (xap (iota-list (length range)) range)))

    (set-macro-character #\[ #'open-bracket-macro-char)
    (set-macro-character #\] (get-macro-character #\) ))

    (defun iota-list (n)     ;Return list of integers from 0 to n-1
      (do ((j (- n 1) (- j 1))
           (z '() (cons j z)))
          ((< j 0) z)))

The code for reading xappings in the more general brace notation, with all the possibilities for xets (or individual xet pairs), infinite xappings, and exceptions, is a bit more complicated; it is shown in table 22-5.
That code is used in conjunction with the initializations

    (set-macro-character #\{ #'open-brace-macro-char)
    (set-macro-character #\} (get-macro-character #\) ))

----------------------------------------------------------------
Table 22-5: Macro Character Definition for Xapping Syntax

    ;; #\-> stands for the right-arrow character (or #\! if the
    ;; host Common Lisp has no right-arrow character).
    (defun open-brace-macro-char (s macro-char)
      (declare (ignore macro-char))
      (do ((ch (peek-char t s t nil t) (peek-char t s t nil t))
           (domain '())  (range '())  (exceptions '()))
          ((char= ch #\})
           (read-char s t nil t)
           (xap (reverse domain) (reverse range)))
        (cond ((char= ch #\->)
               (read-char s t nil t)
               (let ((nextch (peek-char nil s t nil t)))
                 (cond ((char= nextch #\})
                        (read-char s t nil t)
                        (return (xap (reverse domain)
                                     (reverse range)
                                     nil :universal exceptions)))
                       (t (let ((item (read s t nil t)))
                            (cond ((char= (peek-char t s t nil t) #\})
                                   (read-char s t nil t)
                                   (return (xap (reverse domain)
                                                (reverse range)
                                                item :constant
                                                exceptions)))
                                  (t (reader-error s
                                       "Default -> item must be last"))))))))
              (t (let ((item (read-preserving-whitespace s t nil t))
                       (nextch (peek-char nil s t nil t)))
                   (cond ((char= nextch #\->)
                          (read-char s t nil t)
                          (cond ((member (peek-char nil s t nil t)
                                         '(#\Space #\Tab #\Newline))
                                 (push item exceptions))
                                (t (push item domain)
                                   (push (read s t nil t) range))))
                         ((char= nextch #\})
                          (read-char s t nil t)
                          (push item domain)
                          (push item range)
                          (return (xap (reverse domain) (reverse range))))
                         (t (push item domain)
                            (push item range))))))))

----------------------------------------------------------------
[change_end]

[Function]
make-dispatch-macro-character char &optional non-terminating-p readtable

This causes the character char to be a dispatching macro character in readtable (which defaults to the current readtable). If non-terminating-p is not nil (it defaults to nil), then it will be a non-terminating macro character: it may be embedded within extended tokens. make-dispatch-macro-character returns t.
Initially every character in the dispatch table has a character-macro function that signals an error. Use set-dispatch-macro-character to define entries in the dispatch table. [change_begin] X3J13 voted in January 1989 (ARGUMENTS-UNDERSPECIFIED) to clarify that char must be a character. [change_end] [Function] set-dispatch-macro-character disp-char sub-char function &optional readtable get-dispatch-macro-character disp-char sub-char &optional readtable set-dispatch-macro-character causes function to be called when the disp-char followed by sub-char is read. The readtable defaults to the current readtable. The arguments and return values for function are the same as for normal macro characters except that function gets sub-char, not disp-char, as its second argument and also receives a third argument that is the non-negative integer whose decimal representation appeared between disp-char and sub-char, or nil if no decimal integer appeared there. The sub-char may not be one of the ten decimal digits; they are always reserved for specifying an infix integer argument. Moreover, if sub-char is a lowercase character (see lower-case-p), its uppercase equivalent is used instead. (This is how the rule is enforced that the case of a dispatch sub-character doesn't matter.) set-dispatch-macro-character returns t. get-dispatch-macro-character returns the macro-character function for sub-char under disp-char, or nil if there is no function associated with sub-char. If the sub-char is one of the ten decimal digits 0 1 2 3 4 5 6 7 8 9, get-dispatch-macro-character always returns nil. If sub-char is a lowercase character, its uppercase equivalent is used instead. [change_begin] X3J13 voted in January 1989 (GET-MACRO-CHARACTER-READTABLE) to specify that if nil is explicitly passed as the readtable argument (the third argument) to get-dispatch-macro-character, then the standard readtable is used. This is consistent with the behavior of copy-readtable.
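As a small illustration of the third argument that a dispatch function receives, here is a minimal sketch. It assumes that the sub-character ? is not otherwise defined under # in the standard syntax; the names sharp-question-reader, read-with-repeat-syntax, and repeat are hypothetical and chosen only for this example. The syntax #n?form is made to read as the list (repeat n form), with n defaulting to 1 when no infix integer is written.

```lisp
;; Hypothetical dispatch function: #n?form reads as (repeat n form).
;; ARG is the infix integer written between # and ?, or NIL if none.
(defun sharp-question-reader (stream sub-char arg)
  (declare (ignore sub-char))
  (list 'repeat (or arg 1) (read stream t nil t)))

;; Hypothetical helper: read one object from STRING with the extra
;; syntax installed in a fresh copy of the standard readtable, so
;; that the current readtable is left untouched.
(defun read-with-repeat-syntax (string)
  (let ((*readtable* (copy-readtable nil)))
    (set-dispatch-macro-character #\# #\? #'sharp-question-reader)
    (values (read-from-string string))))
```

With these definitions, (read-with-repeat-syntax "#3?foo") should yield the list (repeat 3 foo). Because the definition is installed in a copy, it vanishes when the binding of *readtable* is undone.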
[change_end] For either function, an error is signaled if the specified disp-char is not in fact a dispatch character in the specified readtable. It is necessary to use make-dispatch-macro-character to set up the dispatch character before specifying its sub-characters. As an example, suppose one would like #$foo to be read as if it were (dollars foo). One might say: (defun |#$-reader| (stream subchar arg) (declare (ignore subchar arg)) (list 'dollars (read stream t nil t))) (set-dispatch-macro-character #\# #\$ #'|#$-reader|) ------------------------------------------------------------------------------- Compatibility note: This macro-character mechanism is different from those in MacLisp, Interlisp, and Lisp Machine Lisp. Recently Lisp systems have implemented very general readers, even readers so programmable that they can parse arbitrary compiled BNF grammars. Unfortunately, these readers can be complicated to use. This design is an attempt to make the reader as simple as possible to understand, use, and implement. Splicing macros have been eliminated; a recent informal poll indicates that no one uses them to produce other than zero or one value. The ability to access parts of the object preceding the macro character has been eliminated. The MacLisp single-character-object feature has been eliminated because it is seldom used and trivially obtainable by defining a macro. The user is encouraged to turn off most macro characters, turn others into single-character-object macros, and then use read purely as a lexical analyzer on top of which to build a parser. It is unnecessary, however, to cater to more complex lexical analysis or parsing than that needed for Common Lisp. ------------------------------------------------------------------------------- [change_begin] [Function] readtable-case readtable X3J13 voted in June 1989 (READ-CASE-SENSITIVITY) to introduce the function readtable-case to control the reader's interpretation of case. 
It provides access to a slot in a readtable, and may be used with setf to alter the state of that slot. The possible values for the slot are :upcase, :downcase, :preserve, and :invert; the readtable-case for the standard readtable is :upcase. Note that copy-readtable is required to copy the readtable-case slot along with all other readtable information.

Once the reader has accumulated a token as described in section 22.1.1, if the token is a symbol, ``replaceable'' characters (unescaped uppercase or lowercase constituent characters) may be modified under the control of the readtable-case of the current readtable:

* For :upcase, replaceable characters are converted to uppercase. (This was the behavior specified by the first edition.)
* For :downcase, replaceable characters are converted to lowercase.
* For :preserve, the cases of all characters remain unchanged.
* For :invert, if all of the replaceable letters in the extended token are of the same case, they are all converted to the opposite case; otherwise the cases of all characters in that token remain unchanged.

As an illustration, consider the following code.

    (let ((*readtable* (copy-readtable nil)))
      (format t "READTABLE-CASE  Input   Symbol-name~
               ~%-----------------------------------~
               ~%")
      (dolist (readtable-case '(:upcase :downcase :preserve :invert))
        (setf (readtable-case *readtable*) readtable-case)
        (dolist (input '("ZEBRA" "Zebra" "zebra"))
          (format t ":~A~16T~A~24T~A~%"
                  (string-upcase readtable-case)
                  input
                  (symbol-name (read-from-string input))))))

The output from this test code should be

    READTABLE-CASE  Input   Symbol-name
    -----------------------------------
    :UPCASE         ZEBRA   ZEBRA
    :UPCASE         Zebra   ZEBRA
    :UPCASE         zebra   ZEBRA
    :DOWNCASE       ZEBRA   zebra
    :DOWNCASE       Zebra   zebra
    :DOWNCASE       zebra   zebra
    :PRESERVE       ZEBRA   ZEBRA
    :PRESERVE       Zebra   Zebra
    :PRESERVE       zebra   zebra
    :INVERT         ZEBRA   zebra
    :INVERT         Zebra   Zebra
    :INVERT         zebra   ZEBRA

The readtable-case of the current readtable also affects the printing of symbols (see *print-case* and *print-escape*).
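The two properties stated above (that the slot governs how symbol tokens are read, and that copy-readtable carries the slot along) can also be checked programmatically rather than by inspecting printed output. The following is a minimal sketch; the function names read-symbol-name-with-case and copied-readtable-case are hypothetical helpers introduced only for this illustration.

```lisp
;; Hypothetical helper: read STRING as a symbol under the given
;; readtable-case MODE and return the resulting print name.
;; MODE is one of :upcase, :downcase, :preserve, or :invert.
(defun read-symbol-name-with-case (string mode)
  (let ((*readtable* (copy-readtable nil)))
    (setf (readtable-case *readtable*) mode)
    (symbol-name (read-from-string string))))

;; Hypothetical helper: set MODE in one readtable, copy that
;; readtable, and report the readtable-case of the copy.
(defun copied-readtable-case (mode)
  (let ((original (copy-readtable nil)))
    (setf (readtable-case original) mode)
    (readtable-case (copy-readtable original))))
```

Both helpers work on fresh copies of the standard readtable, so the readtable in use by the caller is never altered.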
[change_end] ------------------------------------------------------------------------------- 22.1.6. What the Print Function Produces The Common Lisp printer is controlled by a number of special variables. These are referred to in the following discussion and are fully documented at the end of this section. How an expression is printed depends on its data type, as described in the following paragraphs. Integers If appropriate, a radix specifier may be printed; see the variable *print-radix*. If an integer is negative, a minus sign is printed and then the absolute value of the integer is printed. Integers are printed in the radix specified by the variable *print-base* in the usual positional notation, most significant digit first. The number zero is represented by the single digit 0 and never has a sign. A decimal point may then be printed, depending on the value of *print-radix*. Ratios If appropriate, a radix specifier may be printed; see the variable *print-radix*. If the ratio is negative, a minus sign is printed. Then the absolute value of the numerator is printed, as for an integer; then a /; then the denominator. The numerator and denominator are both printed in the radix specified by the variable *print-base*; they are obtained as if by the numerator and denominator functions, and so ratios are always printed in reduced form (lowest terms). Floating-point numbers If the sign of the number (as determined by the function float-sign) is negative, then a minus sign is printed. Then the magnitude is printed in one of two ways. If the magnitude of the floating-point number is either zero or between 10^{-3} (inclusive) and 10^7 (exclusive), it may be printed as the integer part of the number, then a decimal point, followed by the fractional part of the number; there is always at least one digit on each side of the decimal point.
If the format of the number does not match that specified by the variable *read-default-float-format*, then the exponent marker for that format and the digit 0 are also printed. For example, the base of the natural logarithms as a short-format floating-point number might be printed as 2.71828S0. For non-zero magnitudes outside of the range 10^{-3} to 10^7, a floating-point number will be printed in ``computerized scientific notation.'' The representation of the number is scaled to be between 1 (inclusive) and 10 (exclusive) and then printed, with one digit before the decimal point and at least one digit after the decimal point. Next the exponent marker for the format is printed, except that if the format of the number matches that specified by the variable *read-default-float-format*, then the exponent marker E is used. Finally, the power of 10 by which the fraction must be multiplied to equal the original number is printed as a decimal integer. For example, Avogadro's number as a short-format floating-point number might be printed as 6.02S23. Complex numbers A complex number is printed as #C, an open parenthesis, the printed representation of its real part, a space, the printed representation of its imaginary part, and finally a close parenthesis. [old_change_begin] Characters When *print-escape* is nil, a character prints as itself; it is sent directly to the output stream. When *print-escape* is not nil, then #\ syntax is used. For example, the printed representation of the character #\A with control and meta bits on would be #\CONTROL-META-A, and that of #\a with control and meta bits on would be #\CONTROL-META-\a. [old_change_end] [change_begin] X3J13 voted in June 1989 (DATA-IO) to specify that if *print-readably* is not nil then every object must be printed in a readable form, regardless of other printer control variables. For characters, the simplest approach is always to use #\ syntax when *print-readably* is not nil, regardless of the value of *print-escape*.
[change_end] [old_change_begin] Symbols When *print-escape* is nil, only the characters of the print name of the symbol are output (but the case in which to print any uppercase characters in the print name is controlled by the variable *print-case*). [old_change_end] [change_begin] X3J13 voted in June 1989 (READ-CASE-SENSITIVITY) to specify that the new readtable-case slot of the current readtable also controls the case in which letters (whether uppercase or lowercase) in the print name of a symbol are output, no matter what the value of *print-escape*. [change_end] [old_change_begin] The remaining paragraphs describing the printing of symbols cover the situation when *print-escape* is not nil. [old_change_end] [change_begin] X3J13 voted in June 1989 (DATA-IO) to specify that if *print-readably* is not nil then every object must be printed in a readable form, regardless of other printer control variables. For symbols, the simplest approach is to print them, when *print-readably* is not nil, as if *print-escape* were not nil, regardless of the actual value of *print-escape*. [change_end] Backslashes \ and vertical bars | are included as required. In particular, backslash or vertical-bar syntax is used when the name of the symbol would be otherwise treated by the reader as a potential number (see section 22.1.2). In making this decision, it is assumed that the value of *print-base* being used for printing would be used as the value of *read-base* used for reading; the value of *read-base* at the time of printing is irrelevant. For example, if the value of *print-base* were 16 when printing the symbol face, it would have to be printed as \FACE or \Face or |FACE|, because the token face would be read as a hexadecimal number (decimal value 64206) if *read-base* were 16. [old_change_begin] The case in which to print any uppercase characters in the print name is controlled by the variable *print-case*.
[old_change_end] [change_begin] X3J13 voted in June 1989 (PRINT-CASE-PRINT-ESCAPE-INTERACTION) to clarify the interaction of *print-case* with *print-escape*; see *print-case*. [change_end] As a special case [no pun intended], nil may sometimes be printed as () instead, when *print-escape* and *print-pretty* are both not nil. Package prefixes may be printed (using colon syntax) if necessary. The rules for package qualifiers are as follows. When the symbol is printed, if it is in the keyword package, then it is printed with a preceding colon; otherwise, if it is accessible in the current package, it is printed without any qualification; otherwise, it is printed with qualification. See chapter 11. [old_change_begin] A symbol that is uninterned (has no home package) is printed preceded by #: if the variables *print-gensym* and *print-escape* are both non-nil; if either is nil, then the symbol is printed without a prefix, as if it were in the current package. [old_change_end] [change_begin] X3J13 voted in June 1989 (DATA-IO) to specify that if *print-readably* is not nil then every object must be printed in a readable form, regardless of other printer control variables. For uninterned symbols, the simplest approach is to print them, when *print-readably* is not nil, as if *print-escape* and *print-gensym* were not nil, regardless of their actual values. [change_end] ------------------------------------------------------------------------------- Implementation note: Because the #: syntax does not intern the following symbol, it is necessary to use circular-list syntax if *print-circle* is not nil and the same uninterned symbol appears several times in an expression to be printed. For example, the result of (let ((x (make-symbol "FOO"))) (list x x)) would be printed as (#:foo #:foo) if *print-circle* were nil, but as (#1=#:foo #1#) if *print-circle* were not nil. 
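The effect described in this note can be tried out with a minimal sketch such as the following. The function name print-pair-of-gensyms is hypothetical; with-standard-io-syntax is used only to pin the other printer variables to known values, and *print-readably* is rebound to nil so that only *print-circle* varies between the two cases.

```lisp
;; Hypothetical helper: print a two-element list containing the same
;; uninterned symbol twice, with *print-circle* set to CIRCLE-P, and
;; return the printed representation as a string.
(defun print-pair-of-gensyms (circle-p)
  (let ((x (make-symbol "FOO")))
    (with-standard-io-syntax
      (let ((*print-circle* circle-p)
            (*print-readably* nil))
        (prin1-to-string (list x x))))))
```

Reading the resulting string back in shows why the label matters: the text printed with *print-circle* non-nil reads as a list whose two elements are the same (eq) symbol, whereas the text printed with *print-circle* nil reads as a list of two distinct uninterned symbols that merely share a name.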
------------------------------------------------------------------------------- [old_change_begin] The case in which symbols are to be printed is controlled by the variable *print-case*. [old_change_end] [change_begin] It is also controlled by *print-escape* and the readtable-case slot of the current readtable (the value of *readtable*). [change_end] [old_change_begin] Strings The characters of the string are output in order. If *print-escape* is not nil, a double quote is output before and after, and all double quotes and single escape characters are preceded by backslash. The printing of strings is not affected by *print-array*. If the string has a fill pointer, then only those characters below the fill pointer are printed. [old_change_end] [change_begin] X3J13 voted in June 1989 (DATA-IO) to specify that if *print-readably* is not nil then every object must be printed in a readable form, regardless of other printer control variables. For strings, the simplest approach is to print them, when *print-readably* is not nil, as if *print-escape* were not nil, regardless of the actual value of *print-escape*. [change_end] Conses Wherever possible, list notation is preferred over dot notation. Therefore the following algorithm is used: 1. Print an open parenthesis, (. 2. Print the car of the cons. 3. If the cdr is a cons, make it the current cons, print a space, and go to step 2. 4. If the cdr is not null, print a space, a dot, a space, and the cdr. 5. Print a close parenthesis, ). This form of printing is clearer than showing each individual cons cell. Although the two expressions below are equivalent, and the reader will accept either one and produce the same data structure, the printer will always print such a data structure in the second form. (a . (b . ((c . (d . nil)) . (e . nil)))) (a b (c d) e) [old_change_begin] The printing of conses is affected by the variables *print-level* and *print-length*. 
[old_change_end] [change_begin] X3J13 voted in June 1989 (DATA-IO) to specify that if *print-readably* is not nil then every object must be printed in a readable form, regardless of other printer control variables. For conses, the simplest approach is to print them, when *print-readably* is not nil, as if *print-level* and *print-length* were nil, regardless of their actual values. [change_end] [old_change_begin] Bit-vectors A bit-vector is printed as #* followed by the bits of the bit-vector in order. If *print-array* is nil, however, then the bit-vector is printed in a format (using #<) that is concise but not readable. If the bit-vector has a fill pointer, then only those bits below the fill pointer are printed. [old_change_end] [change_begin] X3J13 voted in June 1989 (DATA-IO) to specify that if *print-readably* is not nil then every object must be printed in a readable form, regardless of other printer control variables. For bit-vectors, the simplest approach is to print them, when *print-readably* is not nil, as if *print-array* were not nil, regardless of the actual value of *print-array*. [change_end] Vectors Any vector other than a string or bit-vector is printed using general-vector syntax; this means that information about specialized vector representations will be lost. The printed representation of a zero-length vector is #(). The printed representation of a non-zero-length vector begins with #(. Following that, the first element of the vector is printed. If there are any other elements, they are printed in turn, with a space printed before each additional element. A close parenthesis after the last element terminates the printed representation of the vector. [old_change_begin] The printing of vectors is affected by the variables *print-level* and *print-length*. If the vector has a fill pointer, then only those elements below the fill pointer are printed. 
If *print-array* is nil, however, then the vector is not printed as described above, but in a format (using #<) that is concise but not readable. [old_change_end] [change_begin] X3J13 voted in June 1989 (DATA-IO) to specify that if *print-readably* is not nil then every object must be printed in a readable form, regardless of other printer control variables. For vectors, the simplest approach is to print them, when *print-readably* is not nil, as if *print-level* and *print-length* were nil and *print-array* were not nil, regardless of their actual values. [change_end] Arrays Normally any array other than a vector is printed using #nA format. Let n be the rank of the array. Then # is printed, then n as a decimal integer, then A, then n open parentheses. Next the elements are scanned in row-major order. Imagine the array indices being enumerated in odometer fashion, recalling that the dimensions are numbered from 0 to n-1. Every time the index for dimension j is incremented, the following actions are taken: 1. If j<n-1, then print a close parenthesis. 2. If incrementing the index for dimension j caused it to equal dimension j, reset that index to zero and increment dimension j-1 (thereby performing these three steps recursively), unless j=0, in which case simply terminate the entire algorithm. If incrementing the index for dimension j did not cause it to equal dimension j, then print a space. 3. If j<n-1, then print an open parenthesis. This causes the contents to be printed in a format suitable for use as the :initial-contents argument to make-array. [old_change_begin] The lists effectively printed by this procedure are subject to truncation by *print-level* and *print-length*. 
[old_change_end] If the array is of a specialized type, containing bits or string-characters, then the innermost lists generated by the algorithm given above may instead be printed using bit-vector or string syntax, provided that these innermost lists would not be subject to truncation by *print-length*. For example, a 3-by-2-by-4 array of string-characters that would ordinarily be printed as #3A(((#\s #\t #\o #\p) (#\s #\p #\o #\t)) ((#\p #\o #\s #\t) (#\p #\o #\t #\s)) ((#\t #\o #\p #\s) (#\o #\p #\t #\s))) may instead be printed more concisely as #3A(("stop" "spot") ("post" "pots") ("tops" "opts")) [old_change_begin] If *print-array* is nil, then the array is printed in a format (using #<) that is concise but not readable. [old_change_end] [change_begin] X3J13 voted in June 1989 (DATA-IO) to specify that if *print-readably* is not nil then every object must be printed in a readable form, regardless of other printer control variables. For arrays, the simplest approach is to print them, when *print-readably* is not nil, as if *print-level* and *print-length* were nil and *print-array* were not nil, regardless of their actual values. [change_end] Random-states Common Lisp does not specify a specific syntax for printing objects of type random-state. However, every implementation must arrange to print a random-state object in such a way that, within the same implementation of Common Lisp, the function read can construct from the printed representation a copy of the random-state object as if the copy had been made by make-random-state. [old_change_begin] Pathnames Common Lisp does not specify a specific syntax for printing objects of type pathname. However, every implementation must arrange to print a pathname in such a way that, within the same implementation of Common Lisp, the function read can construct from the printed representation an equivalent instance of the pathname object. 
[old_change_end] [change_begin] X3J13 voted in June 1989 (PATHNAME-PRINT-READ) to specify that if *print-escape* is true, a pathname should be printed by write as #P"..." where "..." is the namestring representation of the pathname. If *print-escape* is false, write prints a pathname by printing its namestring (presumably without escape characters or surrounding double quotes). X3J13 voted in June 1989 (DATA-IO) to specify that if *print-readably* is not nil then every object must be printed in a readable form, regardless of other printer control variables. For pathnames, the simplest approach is to print them, when *print-readably* is not nil, as if *print-escape* were not nil, regardless of its actual value. [change_end] Structures defined by defstruct are printed under the control of the user-specified :print-function option to defstruct. If the user does not provide a printing function explicitly, then a default printing function is supplied that prints the structure using #S syntax (see section 22.1.4). [old_change_begin] Any other types are printed in an implementation-dependent manner. It is recommended that printed representations of all such objects begin with the characters #< and end with > so that the reader will catch such objects and not permit them to be read under normal circumstances. It is specifically and purposely not required that a Common Lisp implementation be able to print an object of type hash-table, readtable, package, stream, or function in a way that can be read back in successfully by read; the use of #< syntax is especially recommended for the printing of such objects. [old_change_end] [change_begin] X3J13 voted in June 1989 (DATA-IO) to specify that if *print-readably* is not nil then every object must be printed in a readable form, regardless of the values of other printer control variables; if this is not possible, then an error of type print-not-readable must be signaled to avoid printing an unreadable syntax such as #<...>. 
X3J13 voted in June 1989 (DATA-IO) to add print-unreadable-object, a macro that prints an object using #<...> syntax and also takes care of checking the variable *print-readably*. [change_end] When debugging or when frequently dealing with large or deep objects at top level, the user may wish to restrict the printer from printing large amounts of information. The variables *print-level* and *print-length* allow the user to control how deep the printer will print and how many elements at a given level the printer will print. Thus the user can see enough of the object to identify it without having to wade through the entire expression. [change_begin] [Variable] *print-readably* The default value of *print-readably* is nil. If *print-readably* is true, then printing any object must either produce a printed representation that the reader will accept or signal an error. If printing is successful, the reader will, on reading the printed representation, produce an object that is ``similar as a constant'' (see section 25.1.4) to the object that was printed. If *print-readably* is true and printing a readable printed representation is not possible, the printer signals an error of type print-not-readable rather than using an unreadable syntax such as #<. The printed representation produced when *print-readably* is true might or might not be the same as the printed representation produced when *print-readably* is false. If *print-readably* is true and another printer control variable (such as *print-length*, *print-level*, *print-escape*, *print-gensym*, *print-array*, or an implementation-defined printer control variable) would cause the preceding requirements to be violated, that other printer control variable is ignored. The printing of interned symbols is not affected by *print-readably*. Note that the ``similar as a constant'' rule for readable printing implies that #A or #( syntax cannot be used for arrays of element-type other than t. 
An implementation will have to use another syntax or signal a print-not-readable error. A print-not-readable error will not be signaled for strings or bit-vectors. All methods for print-object must obey *print-readably*. This rule applies to both user-defined methods and implementation-defined methods. The reader control variable *read-eval* also affects printing. If *read-eval* is false and *print-readably* is true, any print-object method that would otherwise output a #. reader macro must either output something different or signal an error of type print-not-readable. Readable printing of structures and objects of type standard-object is controlled by their print-object methods, not by their make-load-form methods. ``Similarity as a constant'' for these objects is application-dependent and hence is defined to be whatever these methods do. *print-readably* allows errors involving data with no readable printed representation to be detected when writing the file rather than later on when the file is read. *print-readably* is more rigorous than *print-escape*; output printed with escapes must be merely generally recognizable by humans, with a good chance of being recognizable by computers, whereas output printed readably must be reliably recognizable by computers. [change_end] [Variable] *print-escape* When this flag is nil, then escape characters are not output when an expression is printed. In particular, a symbol is printed by simply printing the characters of its print name. The function princ effectively binds *print-escape* to nil. When this flag is not nil, then an attempt is made to print an expression in such a way that it can be read again to produce an equal structure. The function prin1 effectively binds *print-escape* to t. The initial value of this variable is t. ------------------------------------------------------------------------------- Compatibility note: *print-escape* controls what was called slashification in MacLisp. 
------------------------------------------------------------------------------- [Variable] *print-pretty* When this flag is nil, then only a small amount of whitespace is output when printing an expression. When this flag is not nil, then the printer will endeavor to insert extra whitespace where appropriate to make the expression more readable. A few other simple changes may be made, such as printing 'foo instead of (quote foo). The initial value of *print-pretty* is implementation-dependent. [change_begin] X3J13 voted in January 1989 (PRETTY-PRINT-INTERFACE) to adopt a facility for user-controlled pretty printing in Common Lisp (see chapter 27). [change_end] [Variable] *print-circle* When this flag is nil (the default), then the printing process proceeds by recursive descent; an attempt to print a circular structure may lead to looping behavior and failure to terminate. [old_change_begin] When this flag is not nil, then the printer will endeavor to detect cycles in the structure to be printed, and to use #n= and #n# syntax to indicate the circularities. [old_change_end] [change_begin] X3J13 voted in June 1989 (PRINT-CIRCLE-SHARED) to specify that if *print-circle* is true, the printer is required to detect not only cycles but shared substructure, indicating both through the use of #n= and #n# syntax. As an example, under the specification of the first edition (print '(#1=(a #1#) #1#)) might legitimately print (#1=(A #1#) #1#) or (#1=(A #1#) #2=(A #2#)); the vote specifies that the first form is required. X3J13 voted in January 1989 (PRINT-CIRCLE-STRUCTURE) to specify that user-defined printing functions for the defstruct :print-function option, as well as user-defined methods for the CLOS generic function print-object, may print objects to the supplied stream using write, prin1, princ, format, or print-object and expect circularities to be detected and printed using #n# syntax (when *print-circle* is non-nil, of course). 
It seems to me that the same ought to apply to abbreviation as controlled by *print-level* and *print-length*, but that was not addressed by this vote. [change_end] [Variable] *print-base* The value of *print-base* determines in what radix the printer will print rationals. This may be any integer from 2 to 36, inclusive; the default value is 10 (decimal radix). For radices above 10, letters of the alphabet are used to represent digits above 9. ------------------------------------------------------------------------------- Compatibility note: MacLisp calls this variable base, and its default value is 8, not 10. In both MacLisp and Common Lisp, floating-point numbers are always printed in decimal, no matter what the value of *print-base*. ------------------------------------------------------------------------------- [Variable] *print-radix* If the variable *print-radix* is non-nil, the printer will print a radix specifier to indicate the radix in which it is printing a rational number. To prevent confusion of the letter O with the digit 0, and of the letter B with the digit 8, the radix specifier is always printed using lowercase letters. For example, if the current base is twenty-four (decimal), the decimal integer twenty-three would print as #24rN. If *print-base* is 2, 8, or 16, then the radix specifier used is #b, #o, or #x. For integers, base ten is indicated by a trailing decimal point instead of a leading radix specifier; for ratios, however, #10r is used. The default value of *print-radix* is nil. [Variable] *print-case* The read function normally converts lowercase characters appearing in symbols to corresponding uppercase characters, so that internally print names normally contain only uppercase characters. However, users may prefer to see output using lowercase letters or letters of mixed case. This variable controls the case (upper, lower, or mixed) in which to print any uppercase characters in the names of symbols when vertical-bar syntax is not used. 
The value of *print-case* should be one of the keywords :upcase, :downcase, or :capitalize; the initial value is :upcase. Lowercase characters in the internal print name are always printed in lowercase, and are preceded by a single escape character or enclosed by multiple escape characters. Uppercase characters in the internal print name are printed in uppercase, in lowercase, or in mixed case so as to capitalize words, according to the value of *print-case*. The convention for what constitutes a ``word'' is the same as for the function string-capitalize. [change_begin] X3J13 voted in June 1989 (PRINT-CASE-PRINT-ESCAPE-INTERACTION) to clarify the interaction of *print-case* with *print-escape*. When *print-escape* is nil, *print-case* determines the case in which to print all uppercase characters in the print name of the symbol. When *print-escape* is not nil, the implementation has some freedom as to which characters will be printed so as to appear in an ``escape context'' (after an escape character, typically \, or between multiple escape characters, typically |); *print-case* determines the case in which to print all uppercase characters that will not appear in an escape context. For example, when the value of *print-case* is :upcase, an implementation might choose to print the symbol whose print name is "(S)HE" as \(S\)HE or as |(S)HE|, among other possibilities. When the value of *print-case* is :downcase, the corresponding output should be \(s\)he or |(S)HE|, respectively. Consider the following test code. (For the sake of this example assume that readtable-case is :upcase in the current readtable; this is discussed further below.) 
(let ((tabwidth 11))
  (dolist (sym '(|x| |FoObAr| |fOo|))
    (let ((tabstop -1))
      (format t "~&")
      (dolist (escape '(t nil))
        (dolist (case '(:upcase :downcase :capitalize))
          (format t "~VT" (* (incf tabstop) tabwidth))
          (write sym :escape escape :case case)))))
  (format t "~%"))

An implementation that leans heavily on multiple-escape characters (vertical bars) might produce the following output:

|x|        |x|        |x|        x          x          x
|FoObAr|   |FoObAr|   |FoObAr|   FoObAr     foobar     Foobar
|fOo|      |fOo|      |fOo|      fOo        foo        foo

An implementation that leans heavily on single-escape characters (backslashes) might produce the following output:

\x         \x         \x         x          x          x
F\oO\bA\r  f\oo\ba\r  F\oo\ba\r  FoObAr     foobar     Foobar
\fO\o      \fo\o      \fo\o      fOo        foo        foo

These examples are not exhaustive; output using both kinds of escape characters (for example, |FoO|\bA\r) is permissible (though ugly). X3J13 voted in June 1989 (READ-CASE-SENSITIVITY) to add a new readtable-case slot to readtables to control automatic case conversion during the reading of symbols. The value of readtable-case in the current readtable also affects the printing of unescaped letters (letters appearing in an escape context are always printed in their own case).
* If readtable-case is :upcase, unescaped uppercase letters are printed in the case specified by *print-case* and unescaped lowercase letters are printed in their own case. (If *print-escape* is non-nil, all lowercase letters will necessarily be escaped.)
* If readtable-case is :downcase, unescaped lowercase letters are printed in the case specified by *print-case* and unescaped uppercase letters are printed in their own case. (If *print-escape* is non-nil, all uppercase letters will necessarily be escaped.)
* If readtable-case is :preserve, all unescaped letters are printed in their own case, regardless of the value of *print-case*. There is no need to escape any letters, even if *print-escape* is non-nil, though the X3J13 vote did not prohibit escaping letters in this situation.
* If readtable-case is :invert, and if all unescaped letters are of the same case, then the case of all the unescaped letters is inverted; but if the unescaped letters are not all of the same case then each is printed in its own case. (Thus :invert does not always invert the case; the inversion is conditional.) There is no need to escape any letters, even if *print-escape* is non-nil, though the X3J13 vote did not prohibit escaping letters in this situation.

Consider the following code.

;;; Generate a table illustrating READTABLE-CASE and *PRINT-CASE*.
(let ((*readtable* (copy-readtable nil))
      (*print-case* *print-case*))
  (format t "READTABLE-CASE *PRINT-CASE*  Symbol-name  Output~
             ~%------------------------------------------------~
             ~%")
  (dolist (readtable-case '(:upcase :downcase :preserve :invert))
    (setf (readtable-case *readtable*) readtable-case)
    (dolist (print-case '(:upcase :downcase :capitalize))
      (dolist (sym '(|ZEBRA| |Zebra| |zebra|))
        (setq *print-case* print-case)
        (format t ":~A~15T:~A~29T~A~42T~A~%"
                (string-upcase readtable-case)
                (string-upcase print-case)
                (symbol-name sym)
                (prin1-to-string sym))))))

Note that the call to prin1-to-string (the last argument in the call to format that is within the nested loops) effectively uses a non-nil value for *print-escape*. 
Assuming an implementation that uses vertical bars around a symbol name if any characters need escaping, the output from this test code should be

READTABLE-CASE *PRINT-CASE*  Symbol-name  Output
------------------------------------------------
:UPCASE        :UPCASE       ZEBRA        ZEBRA
:UPCASE        :UPCASE       Zebra        |Zebra|
:UPCASE        :UPCASE       zebra        |zebra|
:UPCASE        :DOWNCASE     ZEBRA        zebra
:UPCASE        :DOWNCASE     Zebra        |Zebra|
:UPCASE        :DOWNCASE     zebra        |zebra|
:UPCASE        :CAPITALIZE   ZEBRA        Zebra
:UPCASE        :CAPITALIZE   Zebra        |Zebra|
:UPCASE        :CAPITALIZE   zebra        |zebra|
:DOWNCASE      :UPCASE       ZEBRA        |ZEBRA|
:DOWNCASE      :UPCASE       Zebra        |Zebra|
:DOWNCASE      :UPCASE       zebra        ZEBRA
:DOWNCASE      :DOWNCASE     ZEBRA        |ZEBRA|
:DOWNCASE      :DOWNCASE     Zebra        |Zebra|
:DOWNCASE      :DOWNCASE     zebra        zebra
:DOWNCASE      :CAPITALIZE   ZEBRA        |ZEBRA|
:DOWNCASE      :CAPITALIZE   Zebra        |Zebra|
:DOWNCASE      :CAPITALIZE   zebra        Zebra
:PRESERVE      :UPCASE       ZEBRA        ZEBRA
:PRESERVE      :UPCASE       Zebra        Zebra
:PRESERVE      :UPCASE       zebra        zebra
:PRESERVE      :DOWNCASE     ZEBRA        ZEBRA
:PRESERVE      :DOWNCASE     Zebra        Zebra
:PRESERVE      :DOWNCASE     zebra        zebra
:PRESERVE      :CAPITALIZE   ZEBRA        ZEBRA
:PRESERVE      :CAPITALIZE   Zebra        Zebra
:PRESERVE      :CAPITALIZE   zebra        zebra
:INVERT        :UPCASE       ZEBRA        zebra
:INVERT        :UPCASE       Zebra        Zebra
:INVERT        :UPCASE       zebra        ZEBRA
:INVERT        :DOWNCASE     ZEBRA        zebra
:INVERT        :DOWNCASE     Zebra        Zebra
:INVERT        :DOWNCASE     zebra        ZEBRA
:INVERT        :CAPITALIZE   ZEBRA        zebra
:INVERT        :CAPITALIZE   Zebra        Zebra
:INVERT        :CAPITALIZE   zebra        ZEBRA

This illustrates all combinations for readtable-case and *print-case*. [change_end] [Variable] *print-gensym* The *print-gensym* variable controls whether the prefix #: is printed before symbols that have no home package. The prefix is printed if the variable is not nil. The initial value of *print-gensym* is t. [Variable] *print-level* *print-length* The *print-level* variable controls how many levels deep a nested data object will print. If *print-level* is nil (the initial value), then no control is exercised. Otherwise, the value should be an integer, indicating the maximum level to be printed. 
An object to be printed is at level 0; its components (as of a list or vector) are at level 1; and so on. If an object to be recursively printed has components and is at a level equal to or greater than the value of *print-level*, then the object is printed as simply #. The *print-length* variable controls how many elements at a given level are printed. A value of nil (the initial value) indicates that there be no limit to the number of components printed. Otherwise, the value of *print-length* should be an integer. Should the number of elements of a data object exceed the value *print-length*, the printer will print three dots, ..., in place of those elements beyond the number specified by *print-length*. (In the case of a dotted list, if the list contains exactly as many elements as the value of *print-length*, and in addition has the non-null atom terminating it, that terminating atom is printed rather than the three dots.) *print-level* and *print-length* affect the printing not only of lists but also of vectors, arrays, and any other object printed with a list-like syntax. They do not affect the printing of symbols, strings, and bit-vectors. The Lisp reader will normally signal an error when reading an expression that has been abbreviated because of level or length limits. This signal is given because the # dispatch character normally signals an error when followed by whitespace or ), and because ... is defined to be an illegal token, as are all tokens consisting entirely of periods (other than the single dot used in dot notation). As an example, table 22-6 shows the ways the object (if (member x y) (+ (car x) 3) '(foo . #(a b c d "Baz"))) would be printed for various values of *print-level* (in the column labeled v) and *print-length* (in the column labeled n). 
----------------------------------------------------------------
Table 22-6: Examples of Print Level and Print Length Abbreviation

v   n   Output
======================================================
0   1   #
1   1   (if ...)
1   2   (if # ...)
1   3   (if # # ...)
1   4   (if # # #)
2   1   (if ...)
2   2   (if (member x ...) ...)
2   3   (if (member x y) (+ # 3) ...)
3   2   (if (member x ...) ...)
3   3   (if (member x y) (+ (car x) 3) ...)
3   4   (if (member x y) (+ (car x) 3) '(foo . #(a b c d ...)))
3   5   (if (member x y) (+ (car x) 3) '(foo . #(a b c d "Baz")))
======================================================
----------------------------------------------------------------
[Variable] *print-array* If *print-array* is nil, then the contents of arrays other than strings are never printed. Instead, arrays are printed in a concise form (using #<) that gives enough information for the user to be able to identify the array but does not include the entire array contents. If *print-array* is not nil, non-string arrays are printed using #(, #*, or #nA syntax. [change_begin] Notice of correction. In the first edition, the preceding paragraph mentioned the nonexistent variable print-array instead of *print-array*. [change_end] The initial value of *print-array* is implementation-dependent. [change_begin] [Macro] with-standard-io-syntax {declaration}* {form}* X3J13 voted in June 1989 (DATA-IO) to add the macro with-standard-io-syntax. Within the dynamic extent of the body, all reader/printer control variables, including any implementation-defined ones not specified by Common Lisp, are bound to values that produce standard read/print behavior. Table 22-7 shows the values to which standard Common Lisp variables are bound. The values returned by with-standard-io-syntax are the values of the last body form, or nil if there are no body forms. 
The intent is that a pair of executions, as shown in the following example, should provide reasonably reliable communication of data from one Lisp process to another:

;;; Write DATA to a file.
(with-open-file (file pathname :direction :output)
  (with-standard-io-syntax
    (print data file)))

;;; ... Later, in another Lisp:
(with-open-file (file pathname :direction :input)
  (with-standard-io-syntax
    (setq data (read file))))

Using with-standard-io-syntax to bind all the variables, instead of using let and explicit bindings, ensures that nothing is overlooked and avoids problems with implementation-defined reader/printer control variables. If the user wishes to use a non-standard value for some variable, such as *package* or *read-eval*, it can be bound by let inside the body of with-standard-io-syntax. For example:

;;; Write DATA to a file. Forbid use of #. syntax.
(with-open-file (file pathname :direction :output)
  (with-standard-io-syntax
    (let ((*read-eval* nil))
      (print data file))))

;;; Read DATA from a file. Forbid use of #. syntax.
(with-open-file (file pathname :direction :input)
  (with-standard-io-syntax
    (let ((*read-eval* nil))
      (setq data (read file)))))

Similarly, a user who dislikes the arbitrary choice of values for *print-circle* and *print-pretty* can bind these variables to other values inside the body. The X3J13 vote left it unclear whether with-standard-io-syntax permits declarations to appear before the body of the macro call. I believe that was the intent, and this is reflected in the syntax shown above; but this is only my interpretation. 
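As a minimal sketch of the points made about with-standard-io-syntax (the helper names round-trip and safe-read-from-string are hypothetical, introduced here only for illustration), the following code prints an object to a string and reads it back under standard syntax, and rebinds *read-eval* inside the body so that the non-standard value is the one in effect:

```lisp
;;; ROUND-TRIP and SAFE-READ-FROM-STRING are hypothetical helpers,
;;; not part of Common Lisp.

(defun round-trip (object)
  "Print OBJECT to a string and read it back, both under standard I/O syntax.
Returns the object read and the intermediate text as two values."
  (with-standard-io-syntax
    (let ((text (prin1-to-string object)))
      (values (read-from-string text) text))))

(defun safe-read-from-string (text)
  "Read one object from TEXT under standard I/O syntax, forbidding #. syntax."
  (with-standard-io-syntax
    ;; The LET is inside the body, so it overrides the standard value T.
    (let ((*read-eval* nil))
      (read-from-string text))))

;; Possible usage:
;; (round-trip '(1 "two" #\3 4.5d0))
;; (safe-read-from-string "(1 2 3)")
;; (safe-read-from-string "#.(+ 1 2)")   ; signals an error
```

Because the binding of *read-eval* is established within the dynamic extent of with-standard-io-syntax, it shadows the standard value; a binding made outside the macro call would be hidden by the macro's own bindings.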
----------------------------------------------------------------
Table 22-7: Standard Bindings for I/O Control Variables

Variable                       Value
===========================================================
*package*                      the common-lisp-user package
*print-array*                  t
*print-base*                   10
*print-case*                   :upcase
*print-circle*                 nil
*print-escape*                 t
*print-gensym*                 t
*print-length*                 nil
*print-level*                  nil
*print-lines*                  nil *
*print-miser-width*            nil *
*print-pprint-dispatch*        nil *
*print-pretty*                 nil
*print-radix*                  nil
*print-readably*               t
*print-right-margin*           nil *
*read-base*                    10
*read-default-float-format*    single-float
*read-eval*                    t
*read-suppress*                nil
*readtable*                    the standard readtable

* X3J13 voted in June 1989 (PRETTY-PRINT-INTERFACE) to introduce the printer control variables *print-right-margin*, *print-miser-width*, *print-lines*, and *print-pprint-dispatch* (see section 27.2) but did not specify the values to which with-standard-io-syntax should bind them. I recommend that all four should be bound to nil.
----------------------------------------------------------------
[change_end] ------------------------------------------------------------------------------- 22.2. Input Functions The input functions are divided into two groups: those that operate on streams of characters and those that operate on streams of binary data. ------------------------------------------------------------------------------- * Input from Character Streams * Input from Binary Streams ------------------------------------------------------------------------------- 22.2.1. Input from Character Streams Many character input functions take optional arguments called input-stream, eof-error-p, and eof-value. The input-stream argument is the stream from which to obtain input; if unsupplied or nil it defaults to the value of the special variable *standard-input*. One may also specify t as a stream, meaning the value of the special variable *terminal-io*. 
The eof-error-p argument controls what happens if input is from a file (or any other input source that has a definite end) and the end of the file is reached. If eof-error-p is true (the default), an error will be signaled at end of file. If it is false, then no error is signaled, and instead the function returns eof-value. [change_begin] X3J13 voted in January 1989 (ARGUMENTS-UNDERSPECIFIED) to clarify that an eof-value argument may be any Lisp datum whatsoever. [change_end] Functions such as read that read the representation of an object rather than a single character will always signal an error, regardless of eof-error-p, if the file ends in the middle of an object representation. For example, if a file does not contain enough right parentheses to balance the left parentheses in it, read will complain. If a file ends in a symbol or a number immediately followed by end-of-file, read will read the symbol or number successfully and when called again will see the end-of-file and only then act according to eof-error-p. Similarly, the function read-line will successfully read the last line of a file even if that line is terminated by end-of-file rather than the newline character. If a file contains ignorable text at the end, such as blank lines and comments, read will not consider it to end in the middle of an object. Thus an eof-error-p argument controls what happens when the file ends between objects. Many input functions also take an argument called recursive-p. If specified and not nil, this argument specifies that this call is not a ``top-level'' call to read but an imbedded call, typically from the function for a macro character. It is important to distinguish such recursive calls for three reasons. First, a top-level call establishes the context within which the #n= and #n# syntax is scoped. Consider, for example, the expression (cons '#3=(p q r) '(x y . 
#3#))

If the single-quote macro character were defined in this way:

    (set-macro-character #\'
      #'(lambda (stream char)
          (declare (ignore char))
          (list 'quote (read stream))))

then the expression could not be read properly, because there would be no way to know when read is called recursively by the first occurrence of ' that the label #3= would be referred to later in the containing expression. There would be no way to know because read could not determine that it was called by a macro-character function rather than from ``top level.'' The correct way to define the single quote macro character uses the recursive-p argument:

    (set-macro-character #\'
      #'(lambda (stream char)
          (declare (ignore char))
          (list 'quote (read stream t nil t))))

Second, a recursive call does not alter whether the reading process is to preserve whitespace or not (as determined by whether the top-level call was to read or read-preserving-whitespace). Suppose again that the single quote had the first, incorrect, macro-character definition shown above. Then a call to read-preserving-whitespace that read the expression 'foo would fail to preserve the space character following the symbol foo because the single-quote macro-character function calls read, not read-preserving-whitespace, to read the following expression (in this case foo). The correct definition, which passes the value t for the recursive-p argument to read, allows the top-level call to determine whether whitespace is preserved.

Third, when end-of-file is encountered and the eof-error-p argument is not nil, the kind of error that is signaled may depend on the value of recursive-p. If recursive-p is not nil, then the end-of-file is deemed to have occurred within the middle of a printed representation; if recursive-p is nil, then the end-of-file may be deemed to have occurred between objects rather than within the middle of one.
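The end-of-file conventions described above can be illustrated with a small sketch. The helper read-all-from-string is hypothetical (it is not part of Common Lisp), and reading from a string via with-input-from-string is simply a convenient stand-in for a file. A freshly allocated cons serves as the eof-value, on the assumption that read can never return that particular object.

```lisp
;; Hypothetical helper, not part of Common Lisp: collect every object
;; that READ finds in STRING, stopping cleanly at end-of-file.
(defun read-all-from-string (string)
  (let ((eof (list 'eof)))              ; fresh cons, distinct from anything READ returns
    (with-input-from-string (in string)
      (loop for object = (read in nil eof)  ; eof-error-p is NIL, eof-value is EOF
            until (eq object eof)
            collect object))))

;; End-of-file between objects (after ignorable text) just ends the loop.
(read-all-from-string "a (b c) 42 ; trailing comment")

;; End-of-file in the middle of an object is an error even though
;; eof-error-p is NIL; here the error is caught and replaced by a keyword.
(handler-case (read-all-from-string "(a b")
  (error () :incomplete))
```

Comparing against the marker with eq avoids confusing a legitimately read nil, or any other datum, with the end of the input.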
[Function] read &optional input-stream eof-error-p eof-value recursive-p read reads in the printed representation of a Lisp object from input-stream, builds a corresponding Lisp object, and returns the object. Note that when the variable *read-suppress* is not nil, then read reads in a printed representation as best it can, but most of the work of interpreting the representation is avoided (the intent being that the result is to be discarded anyway). For example, all extended tokens produce the result nil regardless of their syntax. [Variable] *read-default-float-format* The value of this variable must be a type specifier symbol for a specific floating-point format; these include short-float, single-float, double-float, and long-float and may include implementation-specific types as well. The default value is single-float. *read-default-float-format* indicates the floating-point format to be used for reading floating-point numbers that have no exponent marker or have e or E for an exponent marker. (Other exponent markers explicitly prescribe the floating-point format to be used.) The printer also uses this variable to guide the choice of exponent markers when printing floating-point numbers. [Function] read-preserving-whitespace &optional in-stream eof-error-p eof-value recursive-p Certain printed representations given to read, notably those of symbols and numbers, require a delimiting character after them. (Lists do not, because the close parenthesis marks the end of the list.) Normally read will throw away the delimiting character if it is a whitespace character; but read will preserve the character (using unread-char) if it is syntactically meaningful, because it may be the start of the next expression. [change_begin] X3J13 voted in January 1989 (PEEK-CHAR-READ-CHAR-ECHO) to clarify the interaction of unread-char with echo streams. These changes indirectly affect the echoing behavior of read-preserving-whitespace. 
[change_end]

The function read-preserving-whitespace is provided for some specialized situations where it is desirable to determine precisely what character terminated the extended token.

As an example, consider this macro-character definition:

    (defun slash-reader (stream char)
      (declare (ignore char))
      (do ((path (list (read-preserving-whitespace stream))
                 (cons (progn (read-char stream nil nil t)
                              (read-preserving-whitespace stream))
                       path)))
          ((not (char= (peek-char nil stream nil nil t) #\/))
           (cons 'path (nreverse path)))))

    (set-macro-character #\/ #'slash-reader)

(This is actually a rather dangerous definition to make because expressions such as (/ x 3) will no longer be read properly. The ability to reprogram the reader syntax is very powerful and must be used with caution. This redefinition of / is shown here purely for the sake of example.)

Consider now calling read on this expression:

    (zyedh /usr/games/zork /usr/games/boggle)

The / macro reads objects separated by more / characters; thus /usr/games/zork is intended to be read as (path usr games zork). The entire example expression should therefore be read as

    (zyedh (path usr games zork) (path usr games boggle))

However, if read had been used instead of read-preserving-whitespace, then after the reading of the symbol zork, the following space would be discarded; the next call to peek-char would see the following /, and the loop would continue, producing this interpretation:

    (zyedh (path usr games zork usr games boggle))

On the other hand, there are times when whitespace should be discarded. If a command interpreter takes single-character commands, but occasionally reads a Lisp object, then if the whitespace after a symbol is not discarded it might be interpreted as a command some time later after the symbol had been read.

Note that read-preserving-whitespace behaves exactly like read when the recursive-p argument is not nil. The distinction is established only by calls with recursive-p equal to nil or omitted.
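The difference between the two functions can be observed directly by checking which character is left in the stream after a top-level call. The following is a minimal sketch; next-char-after is a hypothetical helper introduced only for illustration, and the standard readtable is assumed (so that / is an ordinary constituent).

```lisp
;; Hypothetical helper: call READER (a function of one stream argument)
;; on a stream over STRING, then return the next unread character.
(defun next-char-after (reader string)
  (with-input-from-string (in string)
    (funcall reader in)                 ; reads the symbol ZORK
    (read-char in)))

;; READ discards the whitespace that delimited the symbol.
(next-char-after #'read "zork /usr")                       ; => #\/

;; READ-PRESERVING-WHITESPACE leaves the delimiting space in the stream.
(next-char-after #'read-preserving-whitespace "zork /usr") ; => #\Space
```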
[Function] read-delimited-list char &optional input-stream recursive-p This reads objects from stream until the next character after an object's representation (ignoring whitespace characters and comments) is char. (The char should not have whitespace syntax in the current readtable.) A list of the objects read is returned. To be more precise, read-delimited-list looks ahead at each step for the next non-whitespace character and peeks at it as if with peek-char. If it is char, then the character is consumed and the list of objects is returned. If it is a constituent or escape character, then read is used to read an object, which is added to the end of the list. If it is a macro character, the associated macro function is called; if the function returns a value, that value is added to the list. The peek-ahead process is then repeated. [change_begin] X3J13 voted in January 1989 (PEEK-CHAR-READ-CHAR-ECHO) to clarify the interaction of peek-char with echo streams. These changes indirectly affect the echoing behavior of the function read-delimited-list. [change_end] This function is particularly useful for defining new macro characters. Usually it is desirable for the terminating character char to be a terminating macro character so that it may be used to delimit tokens; however, read-delimited-list makes no attempt to alter the syntax specified for char by the current readtable. The user must make any necessary changes to the readtable syntax explicitly. The following example illustrates this. Suppose you wanted #{a b c ... z} to be read as a list of all pairs of the elements a, b, c, ..., z; for example: #{p q z a} reads as ((p q) (p z) (p a) (q z) (q a) (z a)) This can be done by specifying a macro-character definition for #{ that does two things: read in all the items up to the }, and construct the pairs. read-delimited-list performs the first task. 
[change_begin]
Note that mapcon allows the mapped function to examine the items of the list after the current one, and that mapcon uses nconc, which is all right because mapcar will produce fresh lists.
[change_end]

    (defun |#{-reader| (stream char arg)
      (declare (ignore char arg))
      (mapcon #'(lambda (x)
                  (mapcar #'(lambda (y) (list (car x) y))
                          (cdr x)))
              (read-delimited-list #\} stream t)))

    (set-dispatch-macro-character #\# #\{ #'|#{-reader|)

    (set-macro-character #\} (get-macro-character #\) nil))

(Note that t is specified for the recursive-p argument.)

It is necessary here to give a definition to the character } as well to prevent it from being a constituent. If the line

    (set-macro-character #\} (get-macro-character #\) nil))

shown above were not included, then the } in

    #{p q z a}

would be considered a constituent character, part of the symbol named a}. One could correct for this by putting a space before the }, but it is better simply to use the call to set-macro-character.

Giving } the same definition as the standard definition of the character ) has the twin benefit of making it terminate tokens for use with read-delimited-list and also making it illegal for use in any other context (that is, attempting to read a stray } will signal an error).

Note that read-delimited-list does not take an eof-error-p (or eof-value) argument. The reason is that it is always an error to hit end-of-file during the operation of read-delimited-list.

[Function]
read-line &optional input-stream eof-error-p eof-value recursive-p

read-line reads in a line of text terminated by a newline. It returns the line as a character string (without the newline character). This function is usually used to get a line of input from the user. A second returned value is a flag that is false if the line was terminated normally, or true if end-of-file terminated the (non-empty) line.
If end-of-file is encountered immediately (that is, appears to terminate an empty line), then end-of-file processing is controlled in the usual way by the eof-error-p, eof-value, and recursive-p arguments. The corresponding output function is write-line. [Function] read-char &optional input-stream eof-error-p eof-value recursive-p read-char inputs one character from input-stream and returns it as a character object. The corresponding output function is write-char. [change_begin] X3J13 voted in January 1989 (PEEK-CHAR-READ-CHAR-ECHO) to clarify the interaction of read-char with echo streams (as created by make-echo-stream). A character is echoed from the input stream to the associated output stream the first time it is seen. If a character is read again because of an intervening unread-char operation, the character is not echoed again when read for the second time or any subsequent time. [change_end] [Function] unread-char character &optional input-stream unread-char puts the character onto the front of input-stream. The character must be the same character that was most recently read from the input-stream. The input-stream ``backs up'' over this character; when a character is next read from input-stream, it will be the specified character followed by the previous contents of input-stream. unread-char returns nil. One may apply unread-char only to the character most recently read from input-stream. Moreover, one may not invoke unread-char twice consecutively without an intervening read-char operation. The result is that one may back up only by one character, and one may not insert any characters into the input stream that were not already there. [change_begin] X3J13 voted in January 1989 (UNREAD-CHAR-AFTER-PEEK-CHAR) to clarify that one also may not invoke unread-char after invoking peek-char without an intervening read-char operation. This is consistent with the notion that peek-char behaves much like read-char followed by unread-char. 
[change_end] ------------------------------------------------------------------------------- Rationale: This is not intended to be a general mechanism, but rather an efficient mechanism for allowing the Lisp reader and other parsers to perform one-character lookahead in the input stream. This protocol admits a wide variety of efficient implementations, such as simply decrementing a buffer pointer. To have to specify the character in the call to unread-char is admittedly redundant, since at any given time there is only one character that may be legally specified. The redundancy is intentional, again to give the implementation latitude. ------------------------------------------------------------------------------- [change_begin] X3J13 voted in January 1989 (PEEK-CHAR-READ-CHAR-ECHO) to clarify the interaction of unread-char with echo streams (as created by make-echo-stream). When a character is ``unread'' from an echo stream, no attempt is made to ``unecho'' the character. However, a character placed back into an echo stream by unread-char will not be re-echoed when it is subsequently re-read by read-char. [change_end] [Function] peek-char &optional peek-type input-stream eof-error-p eof-value recursive-p What peek-char does depends on the peek-type, which defaults to nil. With a peek-type of nil, peek-char returns the next character to be read from input-stream, without actually removing it from the input stream. The next time input is done from input-stream, the character will still be there. It is as if one had called read-char and then unread-char in succession. If peek-type is t, then peek-char skips over whitespace characters (but not comments) and then performs the peeking operation on the next character. This is useful for finding the (possible) beginning of the next printed representation of a Lisp object. The last character examined (the one that starts an object) is not removed from the input stream. 
If peek-type is a character object, then peek-char skips over input characters until a character that is char= to that object is found; that character is left in the input stream. [change_begin] X3J13 voted in January 1989 (PEEK-CHAR-READ-CHAR-ECHO) to clarify the interaction of peek-char with echo streams (as created by make-echo-stream). When a character from an echo stream is only peeked at, it is not echoed at that time. The character remains in the input stream and may be echoed when read by read-char at a later time. Note, however, that if the peek-type is not nil, then characters skipped over (and therefore consumed) by peek-char are treated as if they had been read by read-char, and will be echoed if read-char would have echoed them. [change_end] [Function] listen &optional input-stream The predicate listen is true if there is a character immediately available from input-stream, and is false if not. This is particularly useful when the stream obtains characters from an interactive device such as a keyboard. A call to read-char would simply wait until a character was available, but listen can sense whether or not input is available and allow the program to decide whether or not to attempt input. On a non-interactive stream, the general rule is that listen is true except when at end-of-file. [Function] read-char-no-hang &optional input-stream eof-error-p eof-value recursive-p This function is exactly like read-char, except that if it would be necessary to wait in order to get a character (as from a keyboard), nil is immediately returned without waiting. This allows one to efficiently check for input availability and get the input if it is available. This is different from the listen operation in two ways. First, read-char-no-hang potentially reads a character, whereas listen never inputs a character. 
Second, listen does not distinguish between end-of-file and no input being available, whereas read-char-no-hang does make that distinction, returning eof-value at end-of-file (or signaling an error if eof-error-p is true) but always returning nil if no input is available.

[Function]
clear-input &optional input-stream

This clears any buffered input associated with input-stream. It is primarily useful for clearing type-ahead from keyboards when some kind of asynchronous error has occurred. If this operation doesn't make sense for the stream involved, then clear-input does nothing. clear-input returns nil.

[Function]
read-from-string string &optional eof-error-p eof-value &key :start :end :preserve-whitespace

The characters of string are given successively to the Lisp reader, and the Lisp object built by the reader is returned. Macro characters and so on will all take effect.

The arguments :start and :end delimit a substring of string beginning at the character indexed by :start and up to but not including the character indexed by :end. By default :start is 0 (the beginning of the string) and :end is (length string). This is the same as for other string functions.

The flag :preserve-whitespace, if provided and not nil, indicates that the operation should preserve whitespace as for read-preserving-whitespace. It defaults to nil.

As with other reading functions, the arguments eof-error-p and eof-value control the action if the end of the (sub)string is reached before the operation is completed; reaching the end of the string is treated as any other end-of-file event.

read-from-string returns two values: the first is the object read, and the second is the index of the first character in the string not read. If the entire string was read, the second result will be either the length of the string or one greater than the length of the string. The parameter :preserve-whitespace may affect this second value.
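The effect of :preserve-whitespace on the second value can be seen in a short sketch. The helper index-after-read is hypothetical and exists only to pick out that second value. Note that, because eof-error-p and eof-value are optional arguments that precede the keywords, they must be supplied explicitly whenever a keyword argument is used.

```lisp
;; Hypothetical helper: return only the second value of READ-FROM-STRING,
;; the index of the first character not read.  The optional arguments
;; (eof-error-p = T, eof-value = NIL) are passed explicitly so that the
;; keyword arguments in KEYS are not mistaken for them.
(defun index-after-read (string &rest keys)
  (nth-value 1 (apply #'read-from-string string t nil keys)))

;; The space that delimits FOO is consumed along with the symbol.
(index-after-read "foo bar")                         ; => 4

;; With :preserve-whitespace the delimiting space is left unread.
(index-after-read "foo bar" :preserve-whitespace t)  ; => 3
```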
(read-from-string "(a b c)") => (a b c) and 7 [Function] parse-integer string &key :start :end :radix :junk-allowed This function examines the substring of string delimited by :start and :end (which default to the beginning and end of the string). It skips over whitespace characters and then attempts to parse an integer. The :radix parameter defaults to 10 and must be an integer between 2 and 36. If :junk-allowed is not nil, then the first value returned is the value of the number parsed as an integer or nil if no syntactically correct integer was seen. If :junk-allowed is nil (the default), then the entire substring is scanned. The returned value is the value of the number parsed as an integer. An error is signaled if the substring does not consist entirely of the representation of an integer, possibly surrounded on either side by whitespace characters. In either case, the second value is the index into the string of the delimiter that terminated the parse, or it is the index beyond the substring if the parse terminated at the end of the substring (as will always be the case if :junk-allowed is false). Note that parse-integer does not recognize the syntactic radix-specifier prefixes #O, #B, #X, and #nR, nor does it recognize a trailing decimal point. It permits only an optional sign (+ or -) followed by a non-empty sequence of digits in the specified radix. ------------------------------------------------------------------------------- 22.2.2. Input from Binary Streams Common Lisp currently specifies only a very simple facility for binary input: the reading of a single byte as an integer. [Function] read-byte binary-input-stream &optional eof-error-p eof-value read-byte reads one byte from the binary-input-stream and returns it in the form of an integer. ------------------------------------------------------------------------------- 22.3. 
Output Functions The output functions are divided into two groups: those that operate on streams of characters and those that operate on streams of binary data. The function format operates on streams of characters but is described in a section separate from the other character-output functions because of its great complexity. ------------------------------------------------------------------------------- * Output to Character Streams * Output to Binary Streams * Formatted Output to Character Streams ------------------------------------------------------------------------------- 22.3.1. Output to Character Streams These functions all take an optional argument called output-stream, which is where to send the output. If unsupplied or nil, output-stream defaults to the value of the variable *standard-output*. If it is t, the value of the variable *terminal-io* is used. [old_change_begin] [Function] write object &key :stream :escape :radix :base :circle :pretty :level :length :case :gensym :array The printed representation of object is written to the output stream specified by :stream, which defaults to the value of *standard-output*. The other keyword arguments specify values used to control the generation of the printed representation. Each defaults to the value of the corresponding global variable: see *print-escape*, *print-radix*, *print-base*, *print-circle*, *print-pretty*, *print-level*, *print-length*, *print-case*, *print-array*, and *print-gensym*. (This is the means by which these variables affect printing operations: supplying default values for the write function.) Note that the printing of symbols is also affected by the value of the variable *package*. write returns object. [old_change_end] [change_begin] X3J13 voted in June 1989 (DATA-IO) to add the keyword argument :readably to the function write, and voted in June 1989 (PRETTY-PRINT-INTERFACE) to add the keyword arguments :right-margin, :miser-width, :lines, and :pprint-dispatch. 
The revised description is as follows. [Function] write object &key :stream :escape :radix :base :circle :pretty :level :length :case :gensym :array :readably :right-margin :miser-width :lines :pprint-dispatch The printed representation of object is written to the output stream specified by :stream, which defaults to the value of *standard-output*. The other keyword arguments specify values used to control the generation of the printed representation. Each defaults to the value of the corresponding global variable: see *print-escape*, *print-radix*, *print-base*, *print-circle*, *print-pretty*, *print-level*, *print-length*, and *print-case*, in addition to *print-array*, *print-gensym*, *print-readably*, *print-right-margin*, *print-miser-width*, *print-lines*, and *print-pprint-dispatch*. (This is the means by which these variables affect printing operations: supplying default values for the write function.) Note that the printing of symbols is also affected by the value of the variable *package*. write returns object. [change_end] [Function] prin1 object &optional output-stream print object &optional output-stream pprint object &optional output-stream princ object &optional output-stream prin1 outputs the printed representation of object to output-stream. Escape characters are used as appropriate. Roughly speaking, the output from prin1 is suitable for input to the function read. prin1 returns the object as its value. (prin1 object output-stream) == (write object :stream output-stream :escape t) print is just like prin1 except that the printed representation of object is preceded by a newline (see terpri) and followed by a space. print returns object. pprint is just like print except that the trailing space is omitted and the object is printed with the *print-pretty* flag non-nil to produce ``pretty'' output. pprint returns no values (that is, what the expression (values) returns: zero values). 
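The behavior of prin1 and print described above can be checked by capturing their output in a string. This is only a sketch: output-of is a hypothetical helper, with-output-to-string is used as a convenient character output stream, and the printer control variables are assumed to have their standard values.

```lisp
;; Hypothetical helper: call PRINTER (a function of an object and a
;; stream) and return everything it wrote as a string.
(defun output-of (printer object)
  (with-output-to-string (out)
    (funcall printer object out)))

;; PRIN1 uses escape characters: the string is written with its
;; surrounding double quotes, four characters in all.
(output-of #'prin1 "hi")

;; PRINT writes a newline, then the same representation, then a space.
(output-of #'print "hi")

;; PRIN1 is equivalent to WRITE with :escape t.
(string= (output-of #'prin1 "hi")
         (with-output-to-string (out)
           (write "hi" :stream out :escape t)))   ; => T
```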
[change_begin] X3J13 voted in January 1989 (PRETTY-PRINT-INTERFACE) to adopt a facility for user-controlled pretty printing (see chapter 27). [change_end] princ is just like prin1 except that the output has no escape characters. A symbol is printed as simply the characters of its print name; a string is printed without surrounding double quotes; and there may be differences for other data types as well. The general rule is that output from princ is intended to look good to people, while output from prin1 is intended to be acceptable to the function read. [change_begin] X3J13 voted in June 1987 (PRINC-CHARACTER) to clarify that princ prints a character in exactly the same manner as write-char: the character is simply sent to the output stream. This was implied by the specification in section 22.1.6 in the first edition, but is worth pointing out explicitly here. [change_end] princ returns the object as its value. (princ object output-stream) == (write object :stream output-stream :escape nil) ------------------------------------------------------------------------------- Compatibility note: In MacLisp, the functions prin1, print, and princ return t, not the argument object. ------------------------------------------------------------------------------- [old_change_begin] [Function] write-to-string object &key :escape :radix :base :circle :pretty :level :length :case :gensym :array prin1-to-string object princ-to-string object The object is effectively printed as if by write, prin1, or princ, respectively, and the characters that would be output are made into a string, which is returned. [old_change_end] ------------------------------------------------------------------------------- Compatibility note: The Interlisp function mkstring corresponds to the Common Lisp function princ-to-string. 
------------------------------------------------------------------------------- [change_begin] [Function] write-to-string object &key :escape :radix :base :circle :pretty :level :length :case :gensym :array :readably :right-margin :miser-width :lines :pprint-dispatch X3J13 voted in June 1989 ((DATA-IO) and (PRETTY-PRINT-INTERFACE) ) to add keyword arguments to write; presumably they should also be added to write-to-string. [change_end] [Function] write-char character &optional output-stream write-char outputs the character to output-stream, and returns character. [Function] write-string string &optional output-stream &key :start :end write-line string &optional output-stream &key :start :end write-string writes the characters of the specified substring of string to the output-stream. The :start and :end parameters delimit a substring of string in the usual manner (see chapter 14). write-line does the same thing but then outputs a newline afterwards. (See read-line.) In either case, the string is returned (not the substring delimited by :start and :end). In some implementations these may be much more efficient than an explicit loop using write-char. [Function] terpri &optional output-stream fresh-line &optional output-stream The function terpri outputs a newline to output-stream. It is identical in effect to (write-char #\Newline output-stream); however, terpri always returns nil. fresh-line is similar to terpri but outputs a newline only if the stream is not already at the start of a line. (If for some reason this cannot be determined, then a newline is output anyway.) This guarantees that the stream will be on a ``fresh line'' while consuming as little vertical distance as possible. fresh-line is a predicate that is true if it output a newline, and otherwise false. [Function] finish-output &optional output-stream force-output &optional output-stream clear-output &optional output-stream Some streams may be implemented in an asynchronous or buffered manner. 
The function finish-output attempts to ensure that all output sent to output-stream has reached its destination, and only then returns nil. force-output initiates the emptying of any internal buffers but returns nil without waiting for completion or acknowledgment. The function clear-output, on the other hand, attempts to abort any outstanding output operation in progress in order to allow as little output as possible to continue to the destination. This is useful, for example, to abort a lengthy output to the terminal when an asynchronous error occurs. clear-output returns nil. The precise actions of all three of these operations are implementation-dependent. [change_begin] [Macro] print-unreadable-object (object stream [[ :type type | :identity id ]]) {declaration}* {form}* X3J13 voted in June 1989 (DATA-IO) to add print-unreadable-object, which will output a printed representation of object on stream, beginning with #< and ending with >. Everything output to the stream during execution of the body forms is enclosed in the angle brackets. If type is true, the body output is preceded by a brief description of the object's type and a space character. If id is true, the body output is followed by a space character and a representation of the object's identity, typically a storage address. If *print-readably* is true, print-unreadable-object signals an error of type print-not-readable without printing anything. The object, stream, type, and id arguments are all evaluated normally. The type and id default to false. It is valid to provide no body forms. If type and id are both true and there are no body forms, only one space character separates the printed type and the printed identity. The value returned by print-unreadable-object is nil. 
    (defmethod print-object ((obj airplane) stream)
      (print-unreadable-object (obj stream :type t :identity t)
        (princ (tail-number obj) stream)))

    (print my-airplane) prints

    #<Airplane NW0773 777500123135>   ;In implementation A

    or perhaps

    #<FAA:AIRPLANE NW0773 17>         ;In implementation B

The big advantage of print-unreadable-object is that it allows a user to write print-object methods that adhere to implementation-specific style without requiring the user to write implementation-dependent code.

The X3J13 vote left it unclear whether print-unreadable-object permits declarations to appear before the body of the macro call. I believe that was the intent, and this is reflected in the syntax shown above; but this is only my interpretation.
[change_end]

-------------------------------------------------------------------------------

22.3.2. Output to Binary Streams

Common Lisp currently specifies only a very simple facility for binary output: the writing of a single byte as an integer.

[Function]
write-byte integer binary-output-stream

write-byte writes one byte, the value of integer. It is an error if integer is not of the type specified as the :element-type argument to open when the stream was created. The value integer is returned.

-------------------------------------------------------------------------------

22.3.3. Formatted Output to Character Streams

The function format is very useful for producing nicely formatted text, producing good-looking messages, and so on. format can generate a string or output to a stream.

Formatted output is performed not only by the format function itself but by certain other functions that accept a control string ``the way format does.'' For example, error-signaling functions such as cerror accept format control strings.

[Function]
format destination control-string &rest arguments

format is used to produce formatted output. format outputs the characters of control-string, except that a tilde (~) introduces a directive.
The character after the tilde, possibly preceded by prefix parameters and modifiers, specifies what kind of formatting is desired. Most directives use one or more elements of arguments to create their output; the typical directive puts the next element of arguments into the output, formatted in some special way. It is an error if no argument remains for a directive requiring an argument, but it is not an error if one or more arguments remain unprocessed by a directive. The output is sent to destination. If destination is nil, a string is created that contains the output; this string is returned as the value of the call to format. [change_begin] X3J13 voted in January 1989 (STREAM-ACCESS) to specify that when the first argument to format is nil, format creates a stream of type string-stream in much the same manner as with-output-to-string. (This stream may be visible to the user if, for example, the ~S directive is used to print a defstruct structure that has a user-supplied print function.) [change_end] In all other cases format returns nil, performing output to destination as a side effect. If destination is a stream, the output is sent to it. If destination is t, the output is sent to the stream that is the value of the variable *standard-output*. If destination is a string with a fill pointer, then in effect the output characters are added to the end of the string (as if by use of vector-push-extend). The format function includes some extremely complicated and specialized features. It is not necessary to understand all or even most of its features to use format effectively. The beginner should skip over anything in the following documentation that is not immediately useful or clear. The more sophisticated features (such as conditionals and iteration) are there for the convenience of programs with especially complicated formatting requirements. 
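The rules for destination can be summarized in a short sketch. The function demo-format-destinations is hypothetical and exists only to gather the results in one place; it uses the ~D directive, which prints an integer in decimal, and a string output stream as a stand-in for an arbitrary stream destination.

```lisp
;; Hypothetical demonstration function, not part of Common Lisp.
(defun demo-format-destinations ()
  (let ((buffer (make-array 0 :element-type 'character
                              :adjustable t
                              :fill-pointer 0))
        (stream-result :unset))
    (let ((as-string
            ;; Destination NIL: the output is returned as a string.
            (format nil "count: ~D" 3))
          (captured
            ;; Destination is a stream: output goes to the stream,
            ;; and FORMAT itself returns NIL.
            (with-output-to-string (out)
              (setq stream-result (format out "count: ~D" 3)))))
      ;; Destination is a string with a fill pointer: each call
      ;; appends to the end of the string.
      (format buffer "abc")
      (format buffer "~D" 42)
      (list as-string captured stream-result buffer))))

;; The four elements are "count: 3", "count: 3", NIL, and a string
;; whose contents are "abc42".
(demo-format-destinations)
```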
A format directive consists of a tilde (~), optional prefix parameters separated by commas, optional colon (:) and at-sign (@) modifiers, and a single character indicating what kind of directive this is. The alphabetic case of the directive character is ignored. The prefix parameters are generally integers, notated as optionally signed decimal numbers. [change_begin] X3J13 voted in June 1987 (FORMAT-ATSIGN-COLON) to specify that if both colon and at-sign modifiers are present, they may appear in either order; thus ~:@R and ~@:R mean the same thing. However, it is traditional to put the colon first, and all the examples in this book put colons before at-signs. [change_end] Examples of control strings: "~S" ;An ~S directive with no parameters or modifiers "~3,-4:@s" ;An ~S directive with two parameters, 3 and -4, ; and both the colon and at-sign flags "~,+4S" ;First prefix parameter is omitted and takes ; on its default value; the second parameter is 4 Sometimes a prefix parameter is used to specify a character, for instance the padding character in a right- or left-justifying operation. In this case a single quote (') followed by the desired character may be used as a prefix parameter, to mean the character object that is the character following the single quote. For example, you can use ~5,'0d to print an integer in decimal radix in five columns with leading zeros, or ~5,'*d to get leading asterisks. In place of a prefix parameter to a directive, you can put the letter V (or v), which takes an argument from arguments for use as a parameter to the directive. Normally this should be an integer or character object, as appropriate. This feature allows variable-width fields and the like. If the argument used by a V parameter is nil, the effect is as if the parameter had been omitted. You may also use the character # in place of a parameter; it represents the number of arguments remaining to be processed. 
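The quoted-character, V, and # forms of prefix parameter are easiest to see together. This is a small sketch using only the ~D directive; the function name parameter-demo is hypothetical, and the trailing nil arguments in the last call exist only so that # has something to count.

```lisp
(defun parameter-demo ()
  "Return a list of strings showing character, V, and # prefix parameters."
  (list
   (format nil "~5,'0D" 42)              ; pad character 0 given as '0
   (format nil "~5,'*D" 42)              ; pad character * given as '*
   (format nil "~VD" 6 42)               ; V takes mincol (6) from the arguments
   (format nil "~V,'0D" 6 42)            ; V and a quoted character combined
   (format nil "~VD" nil 42)             ; nil for V acts as an omitted parameter
   (format nil "~#,'0D" 7 nil nil nil))) ; # is 4: four arguments remain
```

The expected results are "00042", "***42", "    42", "000042", "42", and "0007". The last call leaves three arguments unprocessed, which is permitted.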
It is an error to give a format directive more parameters than it is described here as accepting. It is also an error to give colon or at-sign modifiers to a directive in a combination not specifically described here as being meaningful. [change_begin] X3J13 voted in January 1989 (FORMAT-PRETTY-PRINT) to clarify the interaction between format and the various printer control variables (those named *print-xxx*). This is important because many format operations are defined, directly or indirectly, in terms of prin1 or princ, which are affected by the printer control variables. The general rule is that format does not bind any of the standard printer control variables except as specified in the individual descriptions of directives. An implementation may not bind any standard printer control variable not specified in the description of a format directive, nor may an implementation fail to bind any standard printer control variable that is specified to be bound by such a description. (See these descriptions for specific changes voted by X3J13.) One consequence of this change is that the user is guaranteed to be able to use the format ~A and ~S directives to do pretty printing, under control of the *print-pretty* variable. Implementations have differed on this point in their interpretations of the first edition. The new ~W directive may be more appropriate than either ~A or ~S for some purposes, whether for pretty printing or ordinary printing. See section 27.4 for a discussion of ~W and other new format directives related to pretty printing. [change_end] Here are some relatively simple examples to give you the general flavor of how format is used.
(format nil "foo") => "foo"
(setq x 5)
(format nil "The answer is ~D." x) => "The answer is 5."
(format nil "The answer is ~3D." x) => "The answer is   5."
(format nil "The answer is ~3,'0D." x) => "The answer is 005."
(format nil "The answer is ~:D." (expt 47 x)) => "The answer is 229,345,007."
(setq y "elephant") (format nil "Look at the ~A!" y) => "Look at the elephant!" (format nil "Type ~:C to ~A." (set-char-bit #\D :control t) "delete all your files") => "Type Control-D to delete all your files." (setq n 3) (format nil "~D item~:P found." n) => "3 items found." (format nil "~R dog~:[s are~; is~] here." n (= n 1)) => "three dogs are here." (format nil "~R dog~:*~[s are~; is~:;s are~] here." n) => "three dogs are here." (format nil "Here ~[are~;is~:;are~] ~:*~R pupp~:@P." n) => "Here are three puppies." In the descriptions of the directives that follow, the term arg in general refers to the next item of the set of arguments to be processed. The word or phrase at the beginning of each description is a mnemonic (not necessarily an accurate one) for the directive. ~A Ascii. An arg, any Lisp object, is printed without escape characters (as by princ). In particular, if arg is a string, its characters will be output verbatim. If arg is nil, it will be printed as nil; the colon modifier (~:A) will cause an arg of nil to be printed as (), but if arg is a composite structure, such as a list or vector, any contained occurrences of nil will still be printed as nil. ~mincolA inserts spaces on the right, if necessary, to make the width at least mincol columns. The @ modifier causes the spaces to be inserted on the left rather than the right. ~mincol,colinc,minpad,padcharA is the full form of ~A, which allows elaborate control of the padding. The string is padded on the right (or on the left if the @ modifier is used) with at least minpad copies of padchar; padding characters are then inserted colinc characters at a time until the total width is at least mincol. The defaults are 0 for mincol and minpad, 1 for colinc, and the space character for padchar. [change_begin] X3J13 voted in January 1989 (FORMAT-PRETTY-PRINT) to specify that format binds *print-escape* to nil during the processing of the ~A directive. [change_end] ~S S-expression. 
This is just like ~A, but arg is printed with escape characters (as by prin1 rather than princ). The output is therefore suitable for input to read. ~S accepts all the arguments and modifiers that ~A does. [change_begin] X3J13 voted in January 1989 (FORMAT-PRETTY-PRINT) to specify that format binds *print-escape* to t during the processing of the ~S directive. [change_end] ~D Decimal. An arg, which should be an integer, is printed in decimal radix. ~D will never put a decimal point after the number. ~mincolD uses a column width of mincol; spaces are inserted on the left if the number requires fewer than mincol columns for its digits and sign. If the number doesn't fit in mincol columns, additional columns are used as needed. ~mincol,padcharD uses padchar as the pad character instead of space. If arg is not an integer, it is printed in ~A format and decimal base. [change_begin] X3J13 voted in January 1989 (FORMAT-PRETTY-PRINT) to specify that format binds *print-escape* to nil, *print-radix* to nil, and *print-base* to 10 during processing of ~D. [change_end] The @ modifier causes the number's sign to be printed always; the default is to print it only if the number is negative. The : modifier causes commas to be printed between groups of three digits; the third prefix parameter may be used to change the character used as the comma. Thus the most general form of ~D is ~mincol,padchar,commacharD. [change_begin] X3J13 voted in March 1988 (FORMAT-COMMA-INTERVAL) to add a fourth parameter, the commainterval. This must be an integer; if it is not provided, it defaults to 3. This parameter controls the number of digits in each group separated by the commachar. By extension, each of the ~B, ~O, and ~X directives accepts a commainterval as a fourth parameter, and the ~R directive accepts a commainterval as its fifth parameter. 
Examples:
(format nil "~,,' ,4:B" #xFACE) => "1111 1010 1100 1110"
(format nil "~,,' ,4:B" #x1CE) => "1 1100 1110"
(format nil "~19,,' ,4:B" #xFACE) => "1111 1010 1100 1110"
(format nil "~19,,' ,4:B" #x1CE) => "        1 1100 1110"
This is one of those little improvements that probably don't matter much but aren't hard to implement either. It was pretty silly having the number 3 wired into the definition of comma separation when it is just as easy to make it a parameter. [change_end] ~B Binary. This is just like ~D but prints in binary radix (radix 2) instead of decimal. The full form is therefore ~mincol,padchar,commacharB. [change_begin] X3J13 voted in January 1989 (FORMAT-PRETTY-PRINT) to specify that format binds *print-escape* to nil, *print-radix* to nil, and *print-base* to 2 during processing of ~B. [change_end] ~O Octal. This is just like ~D but prints in octal radix (radix 8) instead of decimal. The full form is therefore ~mincol,padchar,commacharO. [change_begin] X3J13 voted in January 1989 (FORMAT-PRETTY-PRINT) to specify that format binds *print-escape* to nil, *print-radix* to nil, and *print-base* to 8 during processing of ~O. [change_end] ~X Hexadecimal. This is just like ~D but prints in hexadecimal radix (radix 16) instead of decimal. The full form is therefore ~mincol,padchar,commacharX. [change_begin] X3J13 voted in January 1989 (FORMAT-PRETTY-PRINT) to specify that format binds *print-escape* to nil, *print-radix* to nil, and *print-base* to 16 during processing of ~X. [change_end] ------------------------------------------------------------------------------- Compatibility note: In MacLisp and Lisp Machine Lisp the ~X directive outputs a space, and ~nX outputs n spaces, in a manner analogous to Fortran X format. In Common Lisp the directive ~@T is used for that purpose. ------------------------------------------------------------------------------- ~R Radix. ~nR prints arg in radix n.
The modifier flags and any remaining parameters are used as for the ~D directive. Indeed, ~D is the same as ~10R. The full form here is therefore ~radix,mincol,padchar,commacharR. [change_begin] X3J13 voted in January 1989 (FORMAT-PRETTY-PRINT) to specify that format binds *print-escape* to nil, *print-radix* to nil, and *print-base* to the value of the first parameter during the processing of the ~R directive with a parameter. [change_end] If no parameters are given to ~R, then an entirely different interpretation is given. [change_begin] Notice of correction. In the first edition, this sentence referred to ``arguments'' given to ~R. The correct term is ``parameters.'' [change_end] The argument should be an integer; suppose it is 4. Then ~R prints arg as a cardinal English number: four; ~:R prints arg as an ordinal English number: fourth; ~@R prints arg as a Roman numeral: IV; and ~:@R prints arg as an old Roman numeral: IIII. [change_begin] X3J13 voted in January 1989 (FORMAT-PRETTY-PRINT) to specify that format binds *print-base* to 10 during the processing of the ~R directive with no parameter. The first edition did not specify how ~R and its variants should handle arguments that are very large or not positive. Actual practice varies, and X3J13 has not yet addressed the topic. Here is a sampling of current practice. For ~@R and ~:@R, nearly all implementations produce Roman numerals only for integers in the range 1 to 3999, inclusive. Some implementations will produce old-style Roman numerals for integers in the range 1 to 4999, inclusive. All other integers are printed in decimal notation, as if ~D had been used. For zero, most implementations print zero for ~R and zeroth for ~:R. For ~R with a negative argument, most implementations simply print the word minus followed by its absolute value as a cardinal in English. 
For ~:R with a negative argument, some implementations also print the word minus followed by its absolute value as an ordinal in English; other implementations print the absolute value followed by the word previous. Thus the argument -4 might produce minus fourth or fourth previous. Each has its charm, but one is not always a suitable substitute for the other; users should be careful. There is standard English nomenclature for fairly large integers (up to 10^60, at least), based on appending the suffix -illion to Latin names of integers. Thus we have the names trillion, quadrillion, sextillion, septillion, and so on. For extremely large integers, one may express powers of ten in English. One implementation gives 1606938044258990275541962092341162602522202993782792835301376 (which is 2^200, the result of (ash 1 200)) in this manner: one times ten to the sixtieth power six hundred six times ten to the fifty-seventh power nine hundred thirty-eight septdecillion forty-four sexdecillion two hundred fifty-eight quindecillion nine hundred ninety quattuordecillion two hundred seventy-five tredecillion five hundred forty-one duodecillion nine hundred sixty-two undecillion ninety-two decillion three hundred forty-one nonillion one hundred sixty-two octillion six hundred two septillion five hundred twenty-two sextillion two hundred two quintillion nine hundred ninety-three quadrillion seven hundred eighty-two trillion seven hundred ninety-two billion eight hundred thirty-five million three hundred one thousand three hundred seventy-six Another implementation prints it this way (note the use of plus): one times ten to the sixtieth power plus six hundred six times ten to the fifty-seventh power plus ... plus two hundred seventy-five times ten to the forty-second power plus five hundred forty-one duodecillion nine hundred sixty-two undecillion ... three hundred seventy-six (I have elided some of the text here to save space.)
Unfortunately, the meaning of this nomenclature differs between American English (in which k-illion means 10^(3(k+1)), so one trillion is 10^12) and British English (in which k-illion means 10^(6k), so one trillion is 10^18). To avoid both confusion and prolixity, I recommend using decimal notation for all numbers above 999,999,999; this is similar to the escape hatch used for Roman numerals. [change_end] ~P Plural. If arg is not eql to the integer 1, a lowercase s is printed; if arg is eql to 1, nothing is printed. (Notice that if arg is a floating-point 1.0, the s is printed.) ~:P does the same thing, after doing a ~:* to back up one argument; that is, it prints a lowercase s if the last argument was not 1. This is useful after printing a number using ~D. ~@P prints y if the argument is 1, or ies if it is not. ~:@P does the same thing, but backs up first.
(format nil "~D tr~:@P/~D win~:P" 7 1) => "7 tries/1 win"
(format nil "~D tr~:@P/~D win~:P" 1 0) => "1 try/0 wins"
(format nil "~D tr~:@P/~D win~:P" 1 3) => "1 try/3 wins"
~C Character. The next arg should be a character; it is printed according to the modifier flags. [old_change_begin] ~C prints the character in an implementation-dependent abbreviated format. This format should be culturally compatible with the host environment. [old_change_end] [change_begin] X3J13 voted in June 1987 (FORMAT-OP-C) to specify that ~C performs exactly the same action as write-char if the character to be printed has zero for its bits attributes. X3J13 voted in March 1989 (CHARACTER-PROPOSAL) to eliminate the bits and font attributes, replacing them with the notion of implementation-defined attributes. The net effect is that characters whose implementation-defined attributes all have the ``standard'' values should be printed by ~C in the same way that write-char would print them. [change_end] ~:C spells out the names of the control bits and represents non-printing characters by their names: Control-Meta-F, Control-Return, Space.
This is a ``pretty'' format for printing characters. ~:@C prints what ~:C would, and then if the character requires unusual shift keys on the keyboard to type it, this fact is mentioned: Control-∂ (Top-F). This is the format for telling the user about a key he or she is expected to type, in prompts, for instance. The precise output may depend not only on the implementation but on the particular I/O devices in use. ~@C prints the character so that the Lisp reader can read it, using #\ syntax. [change_begin] X3J13 voted in January 1989 (FORMAT-PRETTY-PRINT) to specify that format binds *print-escape* to t during the processing of the ~@C directive. Other variants of the ~C directive do not bind any printer control variables. [change_end] ------------------------------------------------------------------------------- Rationale: In some implementations the ~S directive would do what ~C does, but ~C is compatible with Lisp dialects such as MacLisp that do not have a character data type. ------------------------------------------------------------------------------- ~F Fixed-format floating-point. The next arg is printed as a floating-point number. The full form is ~w,d,k,overflowchar,padcharF. The parameter w is the width of the field to be printed; d is the number of digits to print after the decimal point; k is a scale factor that defaults to zero. Exactly w characters will be output. First, leading copies of the character padchar (which defaults to a space) are printed, if necessary, to pad the field on the left. If the arg is negative, then a minus sign is printed; if the arg is not negative, then a plus sign is printed if and only if the @ modifier was specified. Then a sequence of digits, containing a single embedded decimal point, is printed; this represents the magnitude of the value of arg times 10^k, rounded to d fractional digits.
(When rounding up and rounding down would produce printed values equidistant from the scaled value of arg, then the implementation is free to use either one. For example, printing the argument 6.375 using the format ~4,2F may correctly produce either 6.37 or 6.38.) Leading zeros are not permitted, except that a single zero digit is output before the decimal point if the printed value is less than 1, and this single zero digit is not output after all if w=d+1. If it is impossible to print the value in the required format in a field of width w, then one of two actions is taken. If the parameter overflowchar is specified, then w copies of that parameter are printed instead of the scaled value of arg. If the overflowchar parameter is omitted, then the scaled value is printed using more than w characters, as many more as may be needed. If the w parameter is omitted, then the field is of variable width. In effect, a value is chosen for w in such a way that no leading pad characters need to be printed and exactly d characters will follow the decimal point. For example, the directive ~,2F will print exactly two digits after the decimal point and as many as necessary before the decimal point. If the parameter d is omitted, then there is no constraint on the number of digits to appear after the decimal point. A value is chosen for d in such a way that as many digits as possible may be printed subject to the width constraint imposed by the parameter w and the constraint that no trailing zero digits may appear in the fraction, except that if the fraction to be printed is zero, then a single zero digit should appear after the decimal point if permitted by the width constraint. If both w and d are omitted, then the effect is to print the value using ordinary free-format output; prin1 uses this format for any number whose magnitude is either zero or between 10^-3 (inclusive) and 10^7 (exclusive).
If w is omitted, then if the magnitude of arg is so large (or, if d is also omitted, so small) that more than 100 digits would have to be printed, then an implementation is free, at its discretion, to print the number using exponential notation instead, as if by the directive ~E (with all parameters to ~E defaulted, not taking their values from the ~F directive). If arg is a rational number, then it is coerced to be a single-float and then printed. (Alternatively, an implementation is permitted to process a rational number by any other method that has essentially the same behavior but avoids such hazards as loss of precision or overflow because of the coercion. However, note that if w and d are unspecified and the number has no exact decimal representation, for example 1/3, some precision cutoff must be chosen by the implementation: only a finite number of digits may be printed.) If arg is a complex number or some non-numeric object, then it is printed using the format directive ~wD, thereby printing it in decimal radix and a minimum field width of w. (If it is desired to print each of the real part and imaginary part of a complex number using a ~F directive, then this must be done explicitly with two ~F directives and code to extract the two parts of the complex number.) [change_begin] X3J13 voted in January 1989 (FORMAT-PRETTY-PRINT) to specify that format binds *print-escape* to nil during the processing of the ~F directive. 
[change_end]
(defun foo (x)
  (format nil "~6,2F|~6,2,1,'*F|~6,2,,'?F|~6F|~,2F|~F"
          x x x x x x))
(foo 3.14159)  => "  3.14| 31.42|  3.14|3.1416|3.14|3.14159"
(foo -3.14159) => " -3.14|-31.42| -3.14|-3.142|-3.14|-3.14159"
(foo 100.0)    => "100.00|******|100.00| 100.0|100.00|100.0"
(foo 1234.0)   => "1234.00|******|??????|1234.0|1234.00|1234.0"
(foo 0.006)    => "  0.01|  0.06|  0.01| 0.006|0.01|0.006"
------------------------------------------------------------------------------- Compatibility note: The ~F directive is similar to the Fw.d edit descriptor in Fortran. The presence or absence of the @ modifier corresponds to the effect of the Fortran SS or SP edit descriptor; nothing in Common Lisp corresponds to the Fortran S edit descriptor. The scale factor specified by the parameter k corresponds to the scale factor k specified by the Fortran kP edit descriptor. In Fortran, the leading zero that precedes the decimal point when the printed value is less than 1 is optional; in Common Lisp, the implementation is required to print that zero digit. In Common Lisp, the w and d parameters are optional; in Fortran, they are required. In Common Lisp, the pad character and overflow character are user-specifiable; in Fortran, they are always space and asterisk, respectively. A Fortran implementation is prohibited from printing a representation of negative zero; Common Lisp permits the printing of such a representation when appropriate. In MacLisp and Lisp Machine Lisp, the ~F format directive takes a single parameter: the number of digits to use in the printed representation. This incompatibility between Common Lisp and MacLisp was introduced for the sake of cultural compatibility with Fortran. ------------------------------------------------------------------------------- ~E Exponential floating-point. The next arg is printed in exponential notation. The full form is ~w,d,e,k,overflowchar,padchar,exponentcharE.
The parameter w is the width of the field to be printed; d is the number of digits to print after the decimal point; e is the number of digits to use when printing the exponent; k is a scale factor that defaults to 1 (not zero). Exactly w characters will be output. First, leading copies of the character padchar (which defaults to a space) are printed, if necessary, to pad the field on the left. If the arg is negative, then a minus sign is printed; if the arg is not negative, then a plus sign is printed if and only if the @ modifier was specified. Then a sequence of digits, containing a single embedded decimal point, is printed. The form of this sequence of digits depends on the scale factor k. If k is zero, then d digits are printed after the decimal point, and a single zero digit appears before the decimal point if the total field width will permit it. If k is positive, then it must be strictly less than d+2; k significant digits are printed before the decimal point, and d-k+1 digits are printed after the decimal point. If k is negative, then it must be strictly greater than -d; a single zero digit appears before the decimal point if the total field width will permit it, and after the decimal point are printed first -k zeros and then d+k significant digits. The printed fraction must be properly rounded. (When rounding up and rounding down would produce printed values equidistant from the scaled value of arg, then the implementation is free to use either one. For example, printing 637.5 using the format ~8,2E may correctly produce either 6.37E+02 or 6.38E+02.) Following the digit sequence, the exponent is printed. First the character parameter exponentchar is printed; if this parameter is omitted, then the exponent marker that prin1 would use is printed, as determined from the type of the floating-point number and the current value of *read-default-float-format*. 
Next, either a plus sign or a minus sign is printed, followed by e digits representing the power of 10 by which the printed fraction must be multiplied to properly represent the rounded value of arg. If it is impossible to print the value in the required format in a field of width w, possibly because k is too large or too small or because the exponent cannot be printed in e character positions, then one of two actions is taken. If the parameter overflowchar is specified, then w copies of that parameter are printed instead of the scaled value of arg. If the overflowchar parameter is omitted, then the scaled value is printed using more than w characters, as many more as may be needed; if the problem is that d is too small for the specified k or that e is too small, then a larger value is used for d or e as may be needed. If the w parameter is omitted, then the field is of variable width. In effect a value is chosen for w in such a way that no leading pad characters need to be printed. If the parameter d is omitted, then there is no constraint on the number of digits to appear. A value is chosen for d in such a way that as many digits as possible may be printed subject to the width constraint imposed by the parameter w, the constraint of the scale factor k, and the constraint that no trailing zero digits may appear in the fraction, except that if the fraction to be printed is zero, then a single zero digit should appear after the decimal point if the width constraint allows it. If the parameter e is omitted, then the exponent is printed using the smallest number of digits necessary to represent its value. If all of w, d, and e are omitted, then the effect is to print the value using ordinary free-format exponential-notation output; prin1 uses this format for any non-zero number whose magnitude is less than 10^-3 or greater than or equal to 10^7.
[change_begin] X3J13 voted in January 1989 (FORMAT-E-EXPONENT-SIGN) to amend the previous paragraph as follows: If all of w, d, and e are omitted, then the effect is to print the value using ordinary free-format exponential-notation output; prin1 uses a similar format for any non-zero number whose magnitude is less than 10^-3 or greater than or equal to 10^7. The only difference is that the ~E directive always prints a plus or minus sign before the exponent, while prin1 omits the plus sign if the exponent is non-negative. (The amendment reconciles this paragraph with the specification several paragraphs above that ~E always prints a plus or minus sign before the exponent.) [change_end] If arg is a rational number, then it is coerced to be a single-float and then printed. (Alternatively, an implementation is permitted to process a rational number by any other method that has essentially the same behavior but avoids such hazards as loss of precision or overflow because of the coercion. However, note that if w and d are unspecified and the number has no exact decimal representation, for example 1/3, some precision cutoff must be chosen by the implementation: only a finite number of digits may be printed.) If arg is a complex number or some non-numeric object, then it is printed using the format directive ~wD, thereby printing it in decimal radix and a minimum field width of w. (If it is desired to print each of the real part and imaginary part of a complex number using a ~E directive, then this must be done explicitly with two ~E directives and code to extract the two parts of the complex number.) [change_begin] X3J13 voted in January 1989 (FORMAT-PRETTY-PRINT) to specify that format binds *print-escape* to nil during the processing of the ~E directive.
[change_end]
(defun foo (x)
  (format nil "~9,2,1,,'*E|~10,3,2,2,'?,,'$E|~9,3,2,-2,'%@E|~9,2E"
          x x x x))
(foo 3.14159)  => "  3.14E+0| 31.42$-01|+.003E+03|  3.14E+0"
(foo -3.14159) => " -3.14E+0|-31.42$-01|-.003E+03| -3.14E+0"
(foo 1100.0)   => "  1.10E+3| 11.00$+02|+.001E+06|  1.10E+3"
(foo 1100.0L0) => "  1.10L+3| 11.00$+02|+.001L+06|  1.10L+3"
(foo 1.1E13)   => "*********| 11.00$+12|+.001E+16| 1.10E+13"
(foo 1.1L120)  => "*********|??????????|%%%%%%%%%|1.10L+120"
(foo 1.1L1200) => "*********|??????????|%%%%%%%%%|1.10L+1200"
Here is an example of the effects of varying the scale factor:
(dotimes (k 13)
  (format t "~%Scale factor ~2D: |~13,6,2,VE|"
          (- k 5) (- k 5) 3.14159))     ;Prints 13 lines
Scale factor -5: | 0.000003E+06|
Scale factor -4: | 0.000031E+05|
Scale factor -3: | 0.000314E+04|
Scale factor -2: | 0.003142E+03|
Scale factor -1: | 0.031416E+02|
Scale factor  0: | 0.314159E+01|
Scale factor  1: | 3.141590E+00|
Scale factor  2: | 31.41590E-01|
Scale factor  3: | 314.1590E-02|
Scale factor  4: | 3141.590E-03|
Scale factor  5: | 31415.90E-04|
Scale factor  6: | 314159.0E-05|
Scale factor  7: | 3141590.E-06|
------------------------------------------------------------------------------- Compatibility note: The ~E directive is similar to the Ew.d and Ew.dEe edit descriptors in Fortran. The presence or absence of the @ modifier corresponds to the effect of the Fortran SS or SP edit descriptor; nothing in Common Lisp corresponds to the Fortran S edit descriptor. The scale factor specified by the parameter k corresponds to the scale factor k specified by the Fortran kP edit descriptor; note, however, that the default value for k is 1 in Common Lisp, as opposed to the default value of zero in Fortran. (On the other hand, note that a scale factor of 1 is used for Fortran list-directed output, which is roughly equivalent to using ~E with the w, d, e, and overflowchar parameters omitted.) In Common Lisp, the w and d parameters are optional; in Fortran, they are required.
In Fortran, omitting e causes the exponent to be printed using either two or three digits; if three digits are required, then the exponent marker is omitted. In Common Lisp, omitting e causes the exponent to be printed using as few digits as possible; the exponent marker is never omitted. In Common Lisp, the pad character and overflow character are user-specifiable; in Fortran they are always space and asterisk, respectively. A Fortran implementation is prohibited from printing a representation of negative zero; Common Lisp permits the printing of such a representation when appropriate. In MacLisp and Lisp Machine Lisp, the ~E format directive takes a single parameter: the number of digits to use in the printed representation. This incompatibility between Common Lisp and MacLisp was introduced for the sake of cultural compatibility with Fortran. ------------------------------------------------------------------------------- ~G General floating-point. The next arg is printed as a floating-point number in either fixed-format or exponential notation as appropriate. The full form is ~w,d,e,k,overflowchar,padchar,exponentcharG. The format in which to print arg depends on the magnitude (absolute value) of the arg. Let n be an integer such that 10^(n-1) <= arg < 10^n. (If arg is zero, let n be 0.) Let ee equal e+2, or 4 if e is omitted. Let ww equal w-ee, or nil if w is omitted. If d is omitted, first let q be the number of digits needed to print arg with no loss of information and without leading or trailing zeros; then let d equal (max q (min n 7)). Let dd equal d-n. If 0 <= dd <= d, then arg is printed as if by the format directives
~ww,dd,,overflowchar,padcharF~ee@T
Note that the scale factor k is not passed to the ~F directive. For all other values of dd, arg is printed as if by the format directive
~w,d,e,k,overflowchar,padchar,exponentcharE
In either case, an @ modifier is specified to the ~F or ~E directive if and only if one was specified to the ~G directive.
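The choice that ~G makes between fixed-format and exponential output can be traced by hand. The sketch below is not part of any implementation: the names decimal-exponent and general-float-plan are invented for illustration, and it assumes that d is supplied (the rule for an omitted d, which needs the digit count q, is left out). It computes n such that 10^(n-1) <= |arg| < 10^n, then ee, ww, and dd = d - n, and selects the fixed-format branch exactly when 0 <= dd <= d.

```lisp
(defun decimal-exponent (x)
  "Return the integer n with 10^(n-1) <= |x| < 10^n, or 0 when x is zero."
  (let ((m (abs (rational x)))   ; exact arithmetic avoids rounding in log
        (n 0))
    (unless (zerop m)
      (loop while (>= m (expt 10 n)) do (incf n))
      (loop while (< m (expt 10 (1- n))) do (decf n)))
    n))

(defun general-float-plan (arg d &key w e)
  "Return (:fixed ww dd ee) or (:exponential) for the G directive.
D must be supplied; the omitted-d rule is not modeled in this sketch."
  (let* ((n (decimal-exponent arg))
         (ee (if e (+ e 2) 4))        ; ee = e+2, or 4 if e is omitted
         (ww (if w (- w ee) nil))     ; ww = w-ee, or nil if w is omitted
         (dd (- d n)))
    (if (<= 0 dd d)
        (list :fixed ww dd ee)        ; printed as if by ~ww,ddF~ee@T
        (list :exponential))))        ; printed as if by ~E
```

For instance, (general-float-plan 0.314159 2 :w 9 :e 1) gives (:fixed 6 2 3), while (general-float-plan 314.159 2 :w 9 :e 1) gives (:exponential), because there n is 3 and dd is -1.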
[change_begin] X3J13 voted in January 1989 (FORMAT-PRETTY-PRINT) to specify that format binds *print-escape* to nil during the processing of the ~G directive. [change_end]

[old_change_begin] Examples:

(defun foo (x)
  (format nil "~9,2,1,,'*G|~9,3,2,3,'?,,'$G|~9,3,2,0,'%G|~9,2G"
          x x x x))
(foo 0.0314159) => "  3.14E-2|314.2$-04|0.314E-01|  3.14E-2"
(foo 0.314159)  => "  0.31   |0.314    |0.314    | 0.31    "
(foo 3.14159)   => "   3.1   | 3.14    | 3.14    |  3.1    "
(foo 31.4159)   => "   31.   | 31.4    | 31.4    |  31.    "
(foo 314.159)   => "  3.14E+2| 314.    | 314.    |  3.14E+2"
(foo 3141.59)   => "  3.14E+3|314.2$+01|0.314E+04|  3.14E+3"
(foo 3141.59L0) => "  3.14L+3|314.2$+01|0.314L+04|  3.14L+3"
(foo 3.14E12)   => "*********|314.0$+10|0.314E+13| 3.14E+12"
(foo 3.14L120)  => "*********|?????????|%%%%%%%%%|3.14L+120"
(foo 3.14L1200) => "*********|?????????|%%%%%%%%%|3.14L+1200"
[old_change_end]

[change_begin] Notice of correction. In the first edition, the example for the value 3.14E12 contained two typographical errors:

(foo 3.14E12) => "*********|314.2$+10|0.314E+13| 3.14L+12"
                                ^                    ^
                           should be 0          should be E

These have been corrected above. [change_end]

-------------------------------------------------------------------------------
Compatibility note: The ~G directive is similar to the Gw.d edit descriptor in Fortran. The Common Lisp rules for deciding between the use of ~F and ~E are compatible with the rules used by Fortran but have been extended to cover the cases where w or d is omitted or where e is specified.

In MacLisp and Lisp Machine Lisp, the ~G format directive is equivalent to the Common Lisp ~@* directive. This incompatibility between Common Lisp and MacLisp was introduced for the sake of cultural compatibility with Fortran.
-------------------------------------------------------------------------------

~$ Dollars floating-point. The next arg is printed as a floating-point number in fixed-format notation. This format is particularly convenient for printing a value as dollars and cents.
The full form is ~d,n,w,padchar$. The parameter d is the number of digits to print after the decimal point (default value 2); n is the minimum number of digits to print before the decimal point (default value 1); w is the minimum total width of the field to be printed (default value 0). First padding and the sign are output. If the arg is negative, then a minus sign is printed; if the arg is not negative, then a plus sign is printed if and only if the @ modifier was specified. If the : modifier is used, the sign appears before any padding, and otherwise after the padding. If w is specified and the number of other characters to be output is less than w, then copies of padchar (which defaults to a space) are output to make the total field width equal w. Then n digits are printed for the integer part of arg, with leading zeros if necessary; then a decimal point; then d digits of fraction, properly rounded. If the magnitude of arg is so large that more than m digits would have to be printed, where m is the larger of w and 100, then an implementation is free, at its discretion, to print the number using exponential notation instead, as if by the directive ~w,q,,,,padcharE, where w and padchar are present or omitted according to whether they were present or omitted in the ~$ directive, and where q=d+n-1, where d and n are the (possibly default) values given to the ~$ directive. If arg is a rational number, then it is coerced to be a single-float and then printed. (Alternatively, an implementation is permitted to process a rational number by any other method that has essentially the same behavior but avoids such hazards as loss of precision or overflow because of the coercion.) If arg is a complex number or some non-numeric object, then it is printed using the format directive ~wD, thereby printing it in decimal radix and a minimum field width of w. 
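The parameters of ~$ can be exercised with a few small calls. The sketch below is illustrative only; the function name dollars-samples is hypothetical, and the results shown in the comments follow from the rules stated above for arguments of ordinary magnitude.

```lisp
;; Hypothetical helper, not part of Common Lisp: collect sample ~$ output.
(defun dollars-samples ()
  (list (format nil "~$" 3.14159)           ; defaults: d=2, n=1, w=0
        (format nil "~2,4$" 3.14159)        ; n=4: leading zeros
        (format nil "~2,1,8$" 3.14159)      ; w=8: padded with spaces
        (format nil "~2,1,8,'*$" 3.14159)   ; padchar is *
        (format nil "~@$" 3.14159)          ; @ forces a plus sign
        (format nil "~2,1,8:$" -3.14159)    ; : puts the sign before the padding
        ;; A complex number must be taken apart by hand.
        (let ((z #C(1.5 -2.25)))
          (format nil "~$ ~$" (realpart z) (imagpart z)))))

(dollars-samples)
;; => ("3.14" "0003.14" "    3.14" "****3.14" "+3.14" "-   3.14" "1.50 -2.25")
```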
(If it is desired to print each of the real part and imaginary part of a complex number using a ~$ directive, then this must be done explicitly with two ~$ directives and code to extract the two parts of the complex number.) [change_begin] X3J13 voted in January 1989 (FORMAT-PRETTY-PRINT) to specify that format binds *print-escape* to nil during the processing of the ~$ directive. [change_end] ~% This outputs a #\Newline character, thereby terminating the current output line and beginning a new one (see terpri). ~n% outputs n newlines. No arg is used. Simply putting a newline in the control string would work, but ~% is often used because it makes the control string look nicer in the middle of a Lisp program. ~& Unless it can be determined that the output stream is already at the beginning of a line, this outputs a newline (see fresh-line). ~n& calls fresh-line and then outputs n-1 newlines. ~0& does nothing. ~| This outputs a page separator character, if possible. ~n| does this n times. | is vertical bar, not capital I. ~~ Tilde. This outputs a tilde. ~n~ outputs n tildes. ~<newline> Tilde immediately followed by a newline ignores the newline and any following non-newline whitespace characters. With a :, the newline is ignored, but any following whitespace is left in place. With an @, the newline is left in place, but any following whitespace is ignored. This directive is typically used when a format control string is too long to fit nicely into one line of the program: (defun type-clash-error (fn nargs argnum right-type wrong-type) (format *error-output* "~&Function ~S requires its ~:[~:R~;~*~] ~ argument to be of type ~S,~%but it was called ~ with an argument of type ~S.~%" fn (eql nargs 1) argnum right-type wrong-type)) (type-clash-error 'aref nil 2 'integer 'vector) prints: Function AREF requires its second argument to be of type INTEGER, but it was called with an argument of type VECTOR. 
(type-clash-error 'car 1 1 'list 'short-float) prints: Function CAR requires its argument to be of type LIST, but it was called with an argument of type SHORT-FLOAT. Note that in this example newlines appear in the output only as specified by the ~& and ~% directives; the actual newline characters in the control string are suppressed because each is preceded by a tilde. ~T Tabulate. This spaces over to a given column. ~colnum,colincT will output sufficient spaces to move the cursor to column colnum. If the cursor is already at or beyond column colnum, it will output spaces to move it to column colnum+k*colinc for the smallest positive integer k possible, unless colinc is zero, in which case no spaces are output if the cursor is already at or beyond column colnum. colnum and colinc default to 1. Ideally, the current column position is determined by examination of the destination, whether a stream or string. (Although no user-level operation for determining the column position of a stream is defined by Common Lisp, such a facility may exist at the implementation level.) If for some reason the current absolute column position cannot be determined by direct inquiry, format may be able to deduce the current column position by noting that certain directives (such as ~%, or ~&, or ~A with the argument being a string containing a newline) cause the column position to be reset to zero, and counting the number of characters emitted since that point. If that fails, format may attempt a similar deduction on the riskier assumption that the destination was at column zero when format was invoked. If even this heuristic fails or is implementationally inconvenient, at worst the ~T operation will simply output two spaces. 
(All this implies that code that uses format is more likely to be portable if all format control strings that use the ~T directive either begin with ~% or ~& to force a newline or are designed to be used only when the destination is known from other considerations to be at column zero.) ~@T performs relative tabulation. ~colrel,colinc@T outputs colrel spaces and then outputs the smallest non-negative number of additional spaces necessary to move the cursor to a column that is a multiple of colinc. For example, the directive ~3,8@T outputs three spaces and then moves the cursor to a ``standard multiple-of-eight tab stop'' if not at one already. If the current output column cannot be determined, however, then colinc is ignored, and exactly colrel spaces are output. [change_begin] X3J13 voted in June 1989 (PRETTY-PRINT-INTERFACE) to define ~:T and ~:@T to perform tabulation relative to a point defined by the pretty printing process (see section 27.4). [change_end] ~* The next arg is ignored. ~n* ignores the next n arguments. ~:* ``ignores backwards''; that is, it backs up in the list of arguments so that the argument last processed will be processed again. ~n:* backs up n arguments. When within a ~{ construct (see below), the ignoring (in either direction) is relative to the list of arguments being processed by the iteration. ~n@* is an ``absolute goto'' rather than a ``relative goto'': it goes to the nth arg, where 0 means the first one; n defaults to 0, so ~@* goes back to the first arg. Directives after a ~n@* will take arguments in sequence beginning with the one gone to. When within a ~{ construct, the ``goto'' is relative to the list of arguments being processed by the iteration. ~? Indirection. The next arg must be a string, and the one after it a list; both are consumed by the ~? directive. The string is processed as a format control string, with the elements of the list as the arguments. 
Once the recursive processing of the control string has been finished, then processing of the control string containing the ~? directive is resumed. Example:

(format nil "~? ~D" "<~A ~D>" '("Foo" 5) 7) => "<Foo 5> 7"
(format nil "~? ~D" "<~A ~D>" '("Foo" 5 14) 7) => "<Foo 5> 7"

Note that in the second example three arguments are supplied to the control string "<~A ~D>", but only two are processed and the third is therefore ignored.

With the @ modifier, only one arg is directly consumed. The arg must be a string; it is processed as part of the control string as if it had appeared in place of the ~@? construct, and any directives in the recursively processed control string may consume arguments of the control string containing the ~@? directive. Example:

(format nil "~@? ~D" "<~A ~D>" "Foo" 5 7) => "<Foo 5> 7"
(format nil "~@? ~D" "<~A ~D>" "Foo" 5 14 7) => "<Foo 5> 14"

Here is a rather sophisticated example. The format function itself, as implemented at one time in Lisp Machine Lisp, used a routine internal to the format package called format-error to signal error messages; format-error in turn used error, which used format recursively. Now format-error took a string and arguments, just like format, but also printed the control string to format (which at this point was available in the global variable *ctl-string*) and a little arrow showing where in the processing of the control string the error occurred. The variable *ctl-index* pointed one character after the place of the error.

(defun format-error (string &rest args)     ;Example
  (error nil "~?~%~V@T↓~%~3@T\"~A\"~%"
         string args (+ *ctl-index* 3) *ctl-string*))

(The character set used in the Lisp Machine Lisp implementation contains a down-arrow character ↓, which is not a standard Common Lisp character.)
This first processed the given string and arguments using ~?, then output a newline, tabbed a variable amount for printing the down-arrow, and printed the control string between double quotes (note the use of \" to include double quotes within the control string). The effect was something like this:

(format t "The item is a ~[Foo~;Bar~;Loser~]." 'quux)

>>ERROR: The argument to the FORMAT "~[" command must be a number.
                   ↓
   "The item is a ~[Foo~;Bar~;Loser~]."

-------------------------------------------------------------------------------
Implementation note: Implementors may wish to report errors occurring within format control strings in the manner outlined here. It looks pretty flashy when done properly.
-------------------------------------------------------------------------------

[change_begin] X3J13 voted in June 1989 (PRETTY-PRINT-INTERFACE) to introduce certain format directives to support the user interface to the pretty printer described in detail in chapter 27.

~_ Conditional newline. Without any modifiers, the directive ~_ is equivalent to (pprint-newline :linear). The directive ~@_ is equivalent to (pprint-newline :miser). The directive ~:_ is equivalent to (pprint-newline :fill). The directive ~:@_ is equivalent to (pprint-newline :mandatory).

~W Write. An arg, any Lisp object, is printed obeying every printer control variable (as by write). See section 27.4 for details.

~I Indent. The directive ~nI is equivalent to (pprint-indent :block n). The directive ~n:I is equivalent to (pprint-indent :current n). In both cases, n defaults to zero if it is omitted. [change_end]

The format directives after this point are much more complicated than the foregoing; they constitute control structures that can perform case conversion, conditional selection, iteration, justification, and non-local exits. Used with restraint, they can perform powerful tasks. Used with abandon, they can produce completely unreadable and unmaintainable code.
The case-conversion, conditional, iteration, and justification constructs can contain other formatting constructs by bracketing them. These constructs must nest properly with respect to each other. For example, it is not legitimate to put the start of a case-conversion construct in each arm of a conditional and the end of the case-conversion construct outside the conditional: (format nil "~:[abc~:@(def~;ghi~:@(jkl~]mno~)" x) ;Illegal! One might expect this to produce either "abcDEFMNO" or "ghiJKLMNO", depending on whether x is false or true; but in fact the construction is illegal because the ~[...~;...~] and ~(...~) constructs are not properly nested. The processing indirection caused by the ~? directive is also a kind of nesting for the purposes of this rule of proper nesting. It is not permitted to start a bracketing construct within a string processed under control of a ~? directive and end the construct at some point after the ~? construct in the string containing that construct, or vice versa. For example, this situation is illegal: (format nil "~?ghi~)" "abc~@(def") ;Illegal! One might expect it to produce "abcDEFGHI", but in fact the construction is illegal because the ~? and ~(...~) constructs are not properly nested. ~(str~) Case conversion. The contained control string str is processed, and what it produces is subject to case conversion: ~( converts every uppercase character to the corresponding lowercase character; ~:( capitalizes all words, as if by string-capitalize; ~@( capitalizes just the first word and forces the rest to lowercase; ~:@( converts every lowercase character to the corresponding uppercase character. In this example, ~@( is used to cause the first word produced by ~@R to be capitalized: (format nil "~@R ~(~@R~)" 14 14) => "XIV xiv" (defun f (n) (format nil "~@(~R~) error~:P detected." n)) (f 0) => "Zero errors detected." (f 1) => "One error detected." (f 23) => "Twenty-three errors detected." 
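The four forms of the case-conversion directive can be compared side by side. The following is a small sketch; the function name case-variants is hypothetical and is introduced only for this illustration.

```lisp
;; Hypothetical helper, not part of Common Lisp: apply each of the four
;; case-conversion forms to the same string.
(defun case-variants (string)
  (list (format nil "~(~A~)" string)     ; everything to lowercase
        (format nil "~:(~A~)" string)    ; capitalize every word
        (format nil "~@(~A~)" string)    ; capitalize the first word only
        (format nil "~:@(~A~)" string))) ; everything to uppercase

(case-variants "hELLO wORLD")
;; => ("hello world" "Hello World" "Hello world" "HELLO WORLD")
```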
~[str0~;str1~;...~;strn~] Conditional expression. This is a set of control strings, called clauses, one of which is chosen and used. The clauses are separated by ~; and the construct is terminated by ~]. For example, "~[Siamese~;Manx~;Persian~] Cat" The argth clause is selected, where the first clause is number 0. If a prefix parameter is given (as ~n[), then the parameter is used instead of an argument. (This is useful only if the parameter is specified by #, to dispatch on the number of arguments remaining to be processed.) If arg is out of range, then no clause is selected (and no error is signaled). After the selected alternative has been processed, the control string continues after the ~]. ~[str0~;str1~;...~;strn~:;default~] has a default case. If the last ~; used to separate clauses is ~:; instead, then the last clause is an ``else'' clause that is performed if no other clause is selected. For example: "~[Siamese~;Manx~;Persian~:;Alley~] Cat" ~:[false~;true~] selects the false control string if arg is nil, and selects the true control string otherwise. ~@[true~] tests the argument. If it is not nil, then the argument is not used up by the ~@[ command but remains as the next one to be processed, and the one clause true is processed. If the arg is nil, then the argument is used up, and the clause is not processed. The clause therefore should normally use exactly one argument, and may expect it to be non-nil. For example: (setq *print-level* nil *print-length* 5) (format nil "~@[ print level = ~D~]~@[ print length = ~D~]" *print-level* *print-length*) => " print length = 5" The combination of ~[ and # is useful, for example, for dealing with English conventions for printing lists: (setq foo "Items:~#[ none~; ~S~; ~S and ~S~ ~:;~@{~#[~; and~] ~S~^,~}~].") (format nil foo) => "Items: none." (format nil foo 'foo) => "Items: FOO." (format nil foo 'foo 'bar) => "Items: FOO and BAR." (format nil foo 'foo 'bar 'baz) => "Items: FOO, BAR, and BAZ." 
(format nil foo 'foo 'bar 'baz 'quux) => "Items: FOO, BAR, BAZ, and QUUX." ~; This separates clauses in ~[ and ~< constructions. It is an error elsewhere. ~] This terminates a ~[. It is an error elsewhere. ~{str~} Iteration. This is an iteration construct. The argument should be a list, which is used as a set of arguments as if for a recursive call to format. The string str is used repeatedly as the control string. Each iteration can absorb as many elements of the list as it likes as arguments; if str uses up two arguments by itself, then two elements of the list will get used up each time around the loop. If before any iteration step the list is empty, then the iteration is terminated. Also, if a prefix parameter n is given, then there will be at most n repetitions of processing of str. Finally, the ~^ directive can be used to terminate the iteration prematurely. Here are some simple examples: (format nil "The winners are:~{ ~S~}." '(fred harry jill)) => "The winners are: FRED HARRY JILL." (format nil "Pairs:~{ <~S,~S>~}." '(a 1 b 2 c 3)) => "Pairs: <A,1> <B,2> <C,3>." ~:{str~} is similar, but the argument should be a list of sublists. At each repetition step, one sublist is used as the set of arguments for processing str; on the next repetition, a new sublist is used, whether or not all of the last sublist had been processed. Example: (format nil "Pairs:~:{ <~S,~S>~}." '((a 1) (b 2) (c 3))) => "Pairs: <A,1> <B,2> <C,3>." ~@{str~} is similar to ~{str~}, but instead of using one argument that is a list, all the remaining arguments are used as the list of arguments for the iteration. Example: (format nil "Pairs:~@{ <~S,~S>~}." 'a 1 'b 2 'c 3) => "Pairs: <A,1> <B,2> <C,3>." If the iteration is terminated before all the remaining arguments are consumed, then any arguments not processed by the iteration remain to be processed by any directives following the iteration construct. ~:@{str~} combines the features of ~:{str~} and ~@{str~}. 
All the remaining arguments are used, and each one must be a list. On each iteration, the next argument is used as a list of arguments to str. Example: (format nil "Pairs:~:@{ <~S,~S>~}." '(a 1) '(b 2) '(c 3)) => "Pairs: <A,1> <B,2> <C,3>." Terminating the repetition construct with ~:} instead of ~} forces str to be processed at least once, even if the initial list of arguments is null (however, it will not override an explicit prefix parameter of zero). If str is empty, then an argument is used as str. It must be a string and precede any arguments processed by the iteration. As an example, the following are equivalent: (apply #'format stream string arguments) (format stream "~1{~:}" string arguments) This will use string as a formatting string. The ~1{ says it will be processed at most once, and the ~:} says it will be processed at least once. Therefore it is processed exactly once, using arguments as the arguments. This case may be handled more clearly by the ~? directive, but this general feature of ~{ is more powerful than ~?. ~} This terminates a ~{. It is an error elsewhere. ~mincol,colinc,minpad,padchar<str~> Justification. This justifies the text produced by processing str within a field at least mincol columns wide. str may be divided up into segments with ~;, in which case the spacing is evenly divided between the text segments. With no modifiers, the leftmost text segment is left-justified in the field, and the rightmost text segment right-justified; if there is only one text element, as a special case, it is right-justified. The : modifier causes spacing to be introduced before the first text segment; the @ modifier causes spacing to be added after the last. The minpad parameter (default 0) is the minimum number of padding characters to be output between each segment. The padding character is specified by padchar, which defaults to the space character. 
If the total width needed to satisfy these constraints is greater than mincol, then the width used is mincol+k*colinc for the smallest possible non-negative integer value k; colinc defaults to 1, and mincol defaults to 0.

(format nil "~10<foo~;bar~>")   => "foo    bar"
(format nil "~10:<foo~;bar~>")  => "  foo  bar"
(format nil "~10:@<foo~;bar~>") => "  foo bar "
(format nil "~10<foobar~>")     => "    foobar"
(format nil "~10:<foobar~>")    => "    foobar"
(format nil "~10@<foobar~>")    => "foobar    "
(format nil "~10:@<foobar~>")   => "  foobar  "

Note that str may include format directives. All the clauses in str are processed in order; it is the resulting pieces of text that are justified. The ~^ directive may be used to terminate processing of the clauses prematurely, in which case only the completely processed clauses are justified.

If the first clause of a ~< is terminated with ~:; instead of ~;, then it is used in a special way. All of the clauses are processed (subject to ~^, of course), but the first one is not used in performing the spacing and padding. When the padded result has been determined, then if it will fit on the current line of output, it is output, and the text for the first clause is discarded. If, however, the padded text will not fit on the current line, then the text segment for the first clause is output before the padded text. The first clause ought to contain a newline (such as a ~% directive). The first clause is always processed, and so any arguments it refers to will be used; the decision is whether to use the resulting segment of text, not whether to process the first clause. If the ~:; has a prefix parameter n, then the padded text must fit on the current line with n character positions to spare to avoid outputting the first clause's text. For example, the control string

"~%;; ~{~<~%;; ~1:; ~S~>~^,~}.~%"

can be used to print a list of items separated by commas without breaking items over line boundaries, beginning each line with ;; .
The prefix parameter 1 in ~1:; accounts for the width of the comma that will follow the justified item if it is not the last element in the list, or the period if it is. If ~:; has a second prefix parameter, then it is used as the width of the line, thus overriding the natural line width of the output stream. To make the preceding example use a line width of 50, one would write "~%;; ~{~<~%;; ~1,50:; ~S~>~^,~}.~%" If the second argument is not specified, then format uses the line width of the output stream. If this cannot be determined (for example, when producing a string result), then format uses 72 as the line length. ~> Terminates a ~<. It is an error elsewhere. [change_begin] X3J13 voted in June 1989 (PRETTY-PRINT-INTERFACE) to introduce certain format directives to support the user interface to the pretty printer. If ~:> is used to terminate a ~<... directive, the directive is equivalent to a call on pprint-logical-block. See section 27.4 for details. [change_end] ~^ Up and out. This is an escape construct. If there are no more arguments remaining to be processed, then the immediately enclosing ~{ or ~< construct is terminated. If there is no such enclosing construct, then the entire formatting operation is terminated. In the ~< case, the formatting is performed, but no more segments are processed before doing the justification. The ~^ should appear only at the beginning of a ~< clause, because it aborts the entire clause it appears in (as well as all following clauses). ~^ may appear anywhere in a ~{ construct. (setq donestr "Done.~^ ~D warning~:P.~^ ~D error~:P.") (format nil donestr) => "Done." (format nil donestr 3) => "Done. 3 warnings." (format nil donestr 1 5) => "Done. 1 warning. 5 errors." If a prefix parameter is given, then termination occurs if the parameter is zero. (Hence ~^ is equivalent to ~#^.) If two parameters are given, termination occurs if they are equal. 
If three parameters are given, termination occurs if the first is less than or equal to the second and the second is less than or equal to the third. Of course, this is useless if all the prefix parameters are constants; at least one of them should be a # or a V parameter. If ~^ is used within a ~:{ construct, then it merely terminates the current iteration step (because in the standard case it tests for remaining arguments of the current step only); the next iteration step commences immediately. To terminate the entire iteration process, use ~:^. [change_begin] X3J13 voted in March 1988 (FORMAT-COLON-UPARROW-SCOPE) to clarify the behavior of ~:^ as follows. It may be used only if the command it would terminate is ~:{ or ~:@{. The entire iteration process is terminated if and only if the sublist that is supplying the arguments for the current iteration step is the last sublist (in the case of terminating a ~:{ command) or the last argument to that call to format (in the case of terminating a ~:@{ command). Note furthermore that while ~^ is equivalent to ~#^ in all circumstances, ~:^ is not equivalent to ~:#^ because the latter terminates the entire iteration if and only if no arguments remain for the current iteration step (as opposed to no arguments remaining for the entire iteration process). Here are some examples of the differences in the behaviors of ~^, ~:^, and ~:#^. (format nil "~:{/~S~^ ...~}" '((hot dog) (hamburger) (ice cream) (french fries))) => "/HOT .../HAMBURGER/ICE .../FRENCH ..." For each sublist, `` ...'' appears after the first word unless there are no additional words. (format nil "~:{/~S~:^ ...~}" '((hot dog) (hamburger) (ice cream) (french fries))) => "/HOT .../HAMBURGER .../ICE .../FRENCH" For each sublist, `` ...'' always appears after the first word, unless it is the last sublist, in which case the entire iteration is terminated. 
(format nil "~:{/~S~:#^ ...~}" '((hot dog) (hamburger) (ice cream) (french fries))) => "/HOT .../HAMBURGER"

For each sublist, `` ...'' appears after the first word, but if the sublist has only one word then the entire iteration is terminated. [change_end]

If ~^ appears within a control string being processed under the control of a ~? directive, but not within any ~{ or ~< construct within that string, then the string being processed will be terminated, thereby ending processing of the ~? directive. Processing then continues within the string containing the ~? directive at the point following that directive.

If ~^ appears within a ~[ or ~( construct, then all the commands up to the ~^ are properly selected or case-converted, the ~[ or ~( processing is terminated, and the outward search continues for a ~{ or ~< construct to be terminated. For example:

(setq tellstr "~@(~@[~R~]~^ ~A.~)")
(format nil tellstr 23) => "Twenty-three."
(format nil tellstr nil "losers") => "Losers."
(format nil tellstr 23 "losers") => "Twenty-three losers."

Here are some examples of the use of ~^ within a ~< construct.

(format nil "~15<~S~;~^~S~;~^~S~>" 'foo)           => "            FOO"
(format nil "~15<~S~;~^~S~;~^~S~>" 'foo 'bar)      => "FOO         BAR"
(format nil "~15<~S~;~^~S~;~^~S~>" 'foo 'bar 'baz) => "FOO   BAR   BAZ"

[old_change_begin]
-------------------------------------------------------------------------------
Compatibility note: The ~Q directive and user-defined directives of Zetalisp have been omitted here, as well as control lists (as opposed to strings), which are rumored to be changing in meaning.
-------------------------------------------------------------------------------
[old_change_end]

[change_begin] X3J13 voted in June 1989 (PRETTY-PRINT-INTERFACE) to introduce user-defined directives in the form of the ~/.../ directive. See section 27.4 for details.

The hairiest format control string I have ever seen is shown in table 22-8.
It started innocently enough as part of the simulator for Connection Machine Lisp [44,57]; the xapping data type, defined by defstruct, needed a :print-function option so that xappings would print properly. As this data type became more complicated, step by step, so did the format control string. ---------------------------------------------------------------- Table 22-8: Print Function for the Xapping Data Type (defun print-xapping (xapping stream depth) (declare (ignore depth)) (format stream ;; Are you ready for this one? "~:[{~;[~]~:{~S~:[->~S~;~*~]~:^ ~}~:[~; ~]~ ~{~S->~^ ~}~:[~; ~]~[~*~;->~S~;->~*~]~:[}~;]~]" ;; Is that clear? (xectorp xapping) (do ((vp (xectorp xapping)) (sp (finite-part-is-xetp xapping)) (d (xapping-domain xapping) (cdr d)) (r (xapping-range xapping) (cdr r)) (z '() (cons (list (if vp (car r) (car d)) (or vp sp) (car r)) z))) ((null d) (reverse z))) (and (xapping-domain xapping) (or (xapping-exceptions xapping) (xapping-infinite xapping))) (xapping-exceptions xapping) (and (xapping-exceptions xapping) (xapping-infinite xapping)) (ecase (xapping-infinite xapping) ((nil) 0) (:constant 1) (:universal 2)) (xapping-default xapping) (xectorp xapping))) See section 22.1.5 for the defstruct definition of the xapping data type, whose accessor functions are used in this code. ---------------------------------------------------------------- See the description of set-macro-character for a discussion of xappings and the defstruct definition. Assume that the predicate xectorp is true of a xapping if it is a xector, and that the predicate finite-part-is-xetp is true if every value in the range is the same as its corresponding index. Here is a blow-by-blow description of the parts of this format string: ~:[{~;[~] Print ``['' for a xector, and ``{'' otherwise. ~:{~S~:[->~S~;~*~]~:^ ~} Given a list of lists, print the pairs. 
Each sublist has three elements: the index (or the value if we're printing a xector); a flag that is true for either a xector or xet (in which case no arrow is printed); and the value. Note the use of ~:{ to iterate, and the use of ~:^ to avoid printing a separating space after the final pair (or at all, if there are no pairs). ~:[~; ~] If there were pairs and there are exceptions or an infinite part, print a separating space. ~ Do nothing. This merely allows the format control string to be broken across two lines. ~{~S->~^ ~} Given a list of exception indices, print them. Note the use of ~{ to iterate, and the use of ~^ to avoid printing a separating space after the final exception (or at all, if there are no exceptions). ~:[~; ~] If there were exceptions and there is an infinite part, print a separating space. ~[~*~;->~S~;->~*~] Use ~[ to choose one of three cases for printing the infinite part. ~:[}~;]~] Print ``]'' for a xector, and ``}'' otherwise. [change_end] ------------------------------------------------------------------------------- 22.4. Querying the User The following functions provide a convenient and consistent interface for asking questions of the user. Questions are printed and the answers are read using the stream *query-io*, which normally is synonymous with *terminal-io* but can be rebound to another stream for special applications. [Function] y-or-n-p &optional format-string &rest arguments This predicate is for asking the user a question whose answer is either ``yes'' or ``no.'' It types out a message (if supplied), reads an answer in some implementation-dependent manner (intended to be short and simple, like reading a single character such as Y or N), and is true if the answer was ``yes'' or false if the answer was ``no.'' If the format-string argument is supplied and not nil, then a fresh-line operation is performed; then a message is printed as if the format-string and arguments were given to format. 
Otherwise it is assumed that any message has already been printed by other means. If you want a question mark at the end of the message, you must put it there yourself; y-or-n-p will not add it. However, the message should not contain an explanatory note such as (Y or N), because the nature of the interface provided for y-or-n-p by a given implementation might not involve typing a character on a keyboard; y-or-n-p will provide such a note if appropriate.

All input and output are performed using the stream in the global variable *query-io*.

Here are some examples of the use of y-or-n-p:

(y-or-n-p "Produce listing file?")
(y-or-n-p "Cannot connect to network host ~S. Retry?" host)

y-or-n-p should only be used for questions that the user knows are coming or in situations where the user is known to be waiting for a response of some kind. If the user is unlikely to anticipate the question, or if the consequences of the answer might be grave and irreparable, then y-or-n-p should not be used because the user might type ahead and thereby accidentally answer the question. For such questions as ``Shall I delete all of your files?'' it is better to use yes-or-no-p.

[Function]
yes-or-no-p &optional format-string &rest arguments

This predicate, like y-or-n-p, is for asking the user a question whose answer is either ``yes'' or ``no.'' It types out a message (if supplied), attracts the user's attention (for example, by ringing the terminal's bell), and reads a reply in some implementation-dependent manner. It is intended that the reply require the user to take more action than just a single keystroke, such as typing the full word yes or no followed by a newline.

If the format-string argument is supplied and not nil, then a fresh-line operation is performed; then a message is printed as if the format-string and arguments were given to format. Otherwise it is assumed that any message has already been printed by other means.
If you want a question mark at the end of the message, you must put it there yourself; yes-or-no-p will not add it. However, the message should not contain an explanatory note such as (Yes or No) because the nature of the interface provided for yes-or-no-p by a given implementation might not involve typing the reply on a keyboard; yes-or-no-p will provide such a note if appropriate.

All input and output are performed using the stream in the global variable *query-io*.

To allow the user to answer a yes-or-no question with a single character, use y-or-n-p. yes-or-no-p should be used for unanticipated or momentous questions; this is why it attracts attention and why it requires a multiple-action sequence to answer it.

-------------------------------------------------------------------------------
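Because both query functions in section 22.4 do all of their input and output through *query-io*, rebinding that variable is enough to drive them from a script, for instance in a test. The following is only a sketch: call-with-scripted-query-io is a hypothetical helper, not part of Common Lisp, and it assumes that the implementation's yes-or-no-p accepts the full word yes or no followed by a newline, as the description above suggests. The exact prompt text and any explanatory note are implementation-dependent.

```lisp
;; Hypothetical helper (not part of Common Lisp): call THUNK with
;; *query-io* rebound to a two-way stream that reads the user's
;; replies from the string REPLY and collects all query output.
;; Returns the value of THUNK and the collected output as a string.
(defun call-with-scripted-query-io (reply thunk)
  (let* ((in (make-string-input-stream reply))
         (out (make-string-output-stream))
         ;; *query-io* is special, so this binding is dynamic and is
         ;; already in effect when THUNK is called on the next line.
         (*query-io* (make-two-way-stream in out))
         (answer (funcall thunk)))
    (values answer (get-output-stream-string out))))

;; Example: script the full-word reply "yes" followed by a newline.
;; Assumption: the implementation reads such a reply as affirmative.
(call-with-scripted-query-io
 (format nil "yes~%")
 (lambda () (yes-or-no-p "Shall I delete all of your files?")))
```

The second return value holds whatever the implementation printed, so a caller can check that the message was shown without relying on the exact wording of any note the implementation adds.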